Jun 17 – 21, 2024
ENSEEIHT
Europe/Paris timezone

Session

Keynote: Bruno Gaujal (INRIA)

Jun 20, 2024, 11:00 AM
Amphi B00 (ENSEEIHT)

  1. 6/20/24, 11:00 AM

    Optimistic reinforcement learning algorithms in Markov decision processes rely on two main ingredients to guarantee regret efficiency. The first is the choice of well-tuned confidence bounds, and the second is the design of a suitable rule for ending episodes. While much effort has been dedicated to improving the tightness of confidence bounds, the management of episodes has remained...
