Jun 17 – 21, 2024
ENSEEIHT
Europe/Paris timezone

Session

Parallel session: Online learning

Jun 19, 2024, 1:30 PM
A002 (ENSEEIHT)

Description

Session chair: Tejas Bodas

  1. Kishan Panaganti (California Institute of Technology)
    6/19/24, 1:30 PM

    The robust $\phi$-regularized Markov Decision Process (RRMDP) framework focuses on designing control policies that are robust against parameter uncertainties due to mismatches between the simulator (nominal) model and real-world settings. This work makes two important contributions. First, we propose a model-free algorithm called Robust $\phi$-regularized fitted...

  2. Albert Senen-Cerda (IRIT, LAAS-CNRS, and Université de Toulouse)
    6/19/24, 2:00 PM

    In this talk, we introduce a policy-gradient method for model-based Reinforcement Learning (RL) that exploits a type of stationary distribution commonly obtained from Markov Decision Processes (MDPs) in stochastic networks, queueing systems, and statistical mechanics.
    Specifically, when the stationary distribution of the MDP belongs to an exponential family that is parametrized by policy...

  3. Tejas Bodas (IIIT Hyderabad)
    6/19/24, 2:30 PM

    In the realm of multi-armed bandit problems, the Gittins index policy is known to be optimal in maximizing the expected total discounted reward obtained from pulling the Markovian arms. In most realistic scenarios, however, the Markovian state transition probabilities are unknown, and therefore the Gittins indices cannot be computed. One can then resort to reinforcement learning (RL) algorithms...

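As a rough illustration of the setting in the first talk, the sketch below runs robust value iteration with the KL divergence, one particular choice of $\phi$, where the inner worst-case over perturbed kernels has a closed log-sum-exp form. The penalty weight, the toy problem sizes, and all function names here are illustrative assumptions, not the speaker's algorithm (which is model-free and fitted).

```python
import numpy as np

def robust_bellman_kl(P, R, gamma=0.9, lam=1.0, iters=200):
    """Robust regularized value iteration with a KL penalty (one
    instance of a phi-divergence).  For each (s, a) an adversary may
    perturb the nominal kernel P[s, a] but pays lam * KL(q || p);
    the inner infimum has the closed form -lam * log E_p[exp(-V/lam)].
    P has shape (S, A, S), R has shape (S, A)."""
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        # worst-case expected next value under the KL-penalized adversary
        worst = -lam * np.log(np.einsum('saj,j->sa', P, np.exp(-V / lam)))
        V = np.max(R + gamma * worst, axis=1)
    return V
```

Since the adversary can always keep the nominal kernel at zero KL cost, the robust values are never above the nominal ones; sending `lam` to infinity recovers standard value iteration.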
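For the second talk, the key structural fact is that when the stationary distribution is an exponential family $\pi_\theta(x) \propto \exp(\theta \cdot \phi(x))$, the gradient of the average reward $J(\theta) = \mathbb{E}_{\pi_\theta}[r(x)]$ is the covariance $\mathrm{Cov}_{\pi_\theta}(r(x), \phi(x))$. The toy sketch below evaluates that gradient exactly on a finite state space; the feature matrix, learning rate, and function name are assumptions for illustration, not the speakers' method.

```python
import numpy as np

def exp_family_policy_gradient(feat, r, theta, lr=0.1, steps=200):
    """Exact gradient ascent on J(theta) = E_{pi_theta}[r(x)] when the
    stationary distribution is pi_theta(x) ∝ exp(theta . feat[x]).
    For exponential families, grad J = Cov_{pi_theta}(r, feat),
    i.e. E[r*feat] - E[r]*E[feat].  feat has shape (n_states, dim)."""
    for _ in range(steps):
        logits = feat @ theta
        p = np.exp(logits - logits.max())   # stable softmax
        p /= p.sum()
        grad = feat.T @ (p * r) - (p @ r) * (feat.T @ p)
        theta = theta + lr * grad
    return theta
```

With one-hot features this reduces to a softmax policy over states, and ascent shifts stationary mass toward high-reward states.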
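For the third talk, a standard way to compute the Gittins index (when transition probabilities are known, which is exactly the assumption the talk relaxes) is the retirement formulation: the index of state $s$ is the smallest subsidy $\lambda$ at which retiring, i.e. collecting $\lambda$ forever, is optimal in $s$. A minimal bisection-plus-value-iteration sketch, with all names and problem sizes assumed for illustration:

```python
import numpy as np

def gittins_index(P, r, gamma=0.9, tol=1e-6):
    """Gittins index of each state of a single Markovian arm via the
    retirement formulation: the index of s is the smallest subsidy lam
    such that retiring (worth lam / (1 - gamma)) is optimal at s.
    Found by bisection on lam, solving each optimal-stopping problem
    by value iteration.  Indices lie in [min(r), max(r)]."""
    n = len(r)

    def retire_value(lam):
        V = np.full(n, lam / (1 - gamma))
        for _ in range(500):
            V = np.maximum(lam / (1 - gamma), r + gamma * P @ V)
        return V

    idx = np.zeros(n)
    for s in range(n):
        lo, hi = r.min(), r.max()
        while hi - lo > tol:
            lam = 0.5 * (lo + hi)
            V = retire_value(lam)
            # continuing strictly beats retiring at s => index exceeds lam
            if r[s] + gamma * (P[s] @ V) > lam / (1 - gamma) + 1e-12:
                lo = lam
            else:
                hi = lam
        idx[s] = 0.5 * (lo + hi)
    return idx
```

A sanity check: for an arm that never leaves its state, the index of a state equals its one-step reward, since continuing and retiring at $\lambda = r(s)$ are then worth the same.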