Jun 17 – 21, 2024
ENSEEIHT
Europe/Paris timezone

A Doubly Robust Approach to Sparse Reinforcement Learning

Jun 18, 2024, 2:30 PM
30m
A001 (ENSEEIHT)

Speaker

Assaf Zeevi (Columbia University)

Description

We propose a new regret-minimization algorithm for episodic sparse linear Markov decision processes (SMDPs), in which the state-transition distribution is a linear function of observed features.
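For readers unfamiliar with the setting, a standard way to formalize this assumption is the following, using common notation from the linear MDP literature (the talk's exact definitions may differ):

```latex
% Linear MDP assumption in common notation (not necessarily the talk's
% exact formulation): transitions and rewards at step h are linear in a
% known d-dimensional feature map \phi, and "sparse" means that only
% s_0 \ll d coordinates of the features are relevant.
P_h(s' \mid s, a) = \langle \phi(s, a), \mu_h(s') \rangle,
\qquad
r_h(s, a) = \langle \phi(s, a), \theta_h \rangle .
```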
The only previously known algorithm for SMDPs requires knowledge of the sparsity parameter and oracle access to a reference policy.
We overcome these limitations by combining the doubly robust method, which allows the algorithm to use the feature vectors of all actions, with a novel analysis technique that enables it to use data from all periods in all episodes. The resulting algorithm is shown to achieve the best possible regret, up to logarithmic factors.
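The doubly robust construction mentioned above can be sketched in a one-step (contextual bandit) simplification of the episodic problem. The sketch below is illustrative only, not the speaker's algorithm: it builds a doubly robust pseudo-reward for every action, so that a sparse (Lasso) regression can use the feature vectors of all actions rather than only the action that was played. All names and parameter values are assumptions.

```python
# Illustrative sketch only (not the talk's algorithm): doubly robust
# pseudo-rewards let an L1-penalized regression use the feature vectors
# of ALL actions, not just the one that was played.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
d, K, T = 50, 5, 400            # feature dimension, actions, rounds (toy values)
theta = np.zeros(d)
theta[:3] = rng.normal(size=3)  # sparse ground-truth parameter

X_hist, y_hist = [], []         # features of all actions + pseudo-rewards
theta_hat = np.zeros(d)
lasso = Lasso(alpha=0.1)

for t in range(T):
    feats = rng.normal(size=(K, d)) / np.sqrt(d)  # one feature vector per action
    probs = np.full(K, 1.0 / K)                   # exploration distribution
    a = rng.choice(K, p=probs)                    # action actually played
    r = feats[a] @ theta + 0.1 * rng.normal()     # observed noisy reward

    # Doubly robust pseudo-reward: model prediction for every action,
    # plus an importance-weighted correction on the played action. Each
    # coordinate is an unbiased estimate of that action's expected reward.
    pseudo = feats @ theta_hat
    pseudo[a] += (r - pseudo[a]) / probs[a]

    X_hist.append(feats)
    y_hist.append(pseudo)

    if (t + 1) % 50 == 0:                         # periodically refit the sparse model
        lasso.fit(np.vstack(X_hist), np.concatenate(y_hist))
        theta_hat = lasso.coef_

print("recovered support:", np.flatnonzero(np.abs(theta_hat) > 1e-3))
```

Because every round contributes K labeled feature vectors instead of one, the Lasso sees far more of the feature space, which is what makes it possible to exploit sparsity without knowing the sparsity parameter in advance.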

Primary author

Assaf Zeevi (Columbia University)
