Reinforcement Learning for Stochastic Networks, Toulouse

Name: Reinforcement Learning for Stochastic Networks, Toulouse
Start: 2024-06-17T09:00:00+02:00
End: 2024-06-21T18:00:00+02:00
Location: ENSEEIHT

Jun 17 – 21, 2024

Europe/Paris timezone

Exploiting Structure in Undiscounted Reinforcement Learning in Markov Decision Processes

Jun 17, 2024, 1:30 PM

30m

A002 (ENSEEIHT)

Parallel session: Challenges and progress in statistical reinforcement learning

Dr Ronald Ortner (Montanuniversität Leoben)

This talk considers reinforcement learning in Markov decision processes
(MDPs) under the undiscounted reward criterion. In this setting the
so-called regret is a natural performance measure that compares the
accumulated reward of the learner to that of an optimal policy. Usually
the regret depends on the size (number of states and actions) of the
underlying MDP as well as on its transition structure. We will examine
structures of the underlying MDP that allow improved bounds on the
regret.
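
For orientation, one common way to formalize this regret in the
literature on undiscounted MDPs is sketched below. The notation is an
assumption and is not taken from the abstract: $T$ is the number of
steps, $r_t$ the reward the learner receives at step $t$, and $\rho^*$
the optimal average reward of the MDP (independent of the start state
when the MDP is communicating).

```latex
% Regret after T steps: reward an optimal policy would accumulate
% on average, minus the reward the learner actually collected.
\[
  \Delta(T) \;=\; T\,\rho^{*} \;-\; \sum_{t=1}^{T} r_t
\]
```

Known bounds of this kind, such as the $\tilde{O}(D S \sqrt{A T})$ bound
for UCRL2 with $S$ states, $A$ actions, and diameter $D$, illustrate the
dependence on both the size and the transition structure of the MDP.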
