Jun 17 – 21, 2024
ENSEEIHT

The Projected Bellman Equation in Reinforcement Learning

Jun 21, 2024, 2:00 PM
1h
Amphi B00 (ENSEEIHT)


Description

Abstract: A question debated throughout the 2020 Simons program on reinforcement learning: is the Q-learning algorithm convergent outside of the tabular setting? It is now known that stability can be ensured using a matrix-gain algorithm, but this requires assumptions, which raises the next question: does a solution to the projected Bellman equation exist at all? Existence of a solution is the minimal requirement for convergence of any algorithm.
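
For orientation, here is a standard formulation of the projected Bellman equation with a linear function class (generic notation, not quoted from the talk): with $Q^{\theta}(x,u) = \theta^{\top}\psi(x,u)$, the goal is a parameter $\theta^*$ satisfying
$$
\mathbb{E}\Big[\psi(X_n,U_n)\big(r(X_n,U_n) + \gamma \max_{u'} Q^{\theta^*}(X_{n+1},u') - Q^{\theta^*}(X_n,U_n)\big)\Big] = 0,
$$
where the expectation is taken in steady state under the policy generating the training input. Existence of such a $\theta^*$ is precisely the minimal requirement referred to above.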

The question was resolved in very recent work. A solution does exist, subject to two assumptions: the function class is linear, and (far more crucially) the input used for training is generated by a form of epsilon-greedy policy with sufficiently small epsilon. Moreover, under these conditions the Q-learning algorithm is shown to be stable, in the sense of bounded parameter estimates. Convergence remains one of many open topics for research.
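
As a sketch of the setting (standard Q-learning notation; the paper's exact assumptions may differ), the parameter recursion and the epsilon-greedy training input take the form
$$
\theta_{n+1} = \theta_n + \alpha_{n+1}\,\psi(X_n,U_n)\big(r(X_n,U_n) + \gamma \max_{u'} Q^{\theta_n}(X_{n+1},u') - Q^{\theta_n}(X_n,U_n)\big),
$$
$$
U_n = \begin{cases} \text{a random action} & \text{with probability } \varepsilon, \\[2pt] \arg\max_{u} Q^{\theta_n}(X_n,u) & \text{with probability } 1-\varepsilon, \end{cases}
$$
and stability means the estimates remain bounded: $\sup_n \|\theta_n\| < \infty$ almost surely.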

In short, sufficient optimism is not only valuable for algorithmic efficiency; it is also a means to algorithmic stability.

Presentation materials

There are no materials yet.