Jun 17 – 21, 2024
ENSEEIHT
Europe/Paris timezone

Session

Parallel session: Online learning

Jun 19, 2024, 1:30 PM
A002 (ENSEEIHT)

Description

Session chair: Tejas Bodas

  1. Kishan Panaganti (California Institute of Technology)
    6/19/24, 1:30 PM

    The robust $\phi$-regularized Markov Decision Process (RRMDP) framework focuses on designing control policies that are robust against parameter uncertainties due to mismatches between the simulator (nominal) model and real-world settings. This work makes two important contributions. First, we propose a model-free algorithm called Robust $\phi$-regularized fitted...

  2. Albert Senen-Cerda (IRIT, LAAS-CNRS, and Université de Toulouse)
    6/19/24, 2:00 PM

    In this talk, we introduce a policy-gradient method for model-based Reinforcement Learning (RL) that exploits a type of stationary distribution commonly obtained from Markov Decision Processes (MDPs) in stochastic networks, queueing systems, and statistical mechanics.
    Specifically, when the stationary distribution of the MDP belongs to an exponential family that is parametrized by policy...

  3. Tejas Bodas (IIIT Hyderabad)
    6/19/24, 2:30 PM

    In the realm of multi-armed bandit problems, the Gittins index policy is known to be optimal in maximizing the expected total discounted reward obtained from pulling the Markovian arms. In most realistic scenarios, however, the Markovian state transition probabilities are unknown, and therefore the Gittins indices cannot be computed. One can then resort to reinforcement learning (RL) algorithms...

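As a rough illustration of the setting in the first talk, the sketch below runs robust value iteration with the KL divergence, one particular choice of $\phi$, where the inner worst-case over perturbed kernels has a closed log-sum-exp form. The penalty weight, the toy problem sizes, and all function names here are illustrative assumptions, not the speaker's algorithm (which is model-free and fitted).

```python
import numpy as np

def robust_bellman_kl(P, R, gamma=0.9, lam=1.0, iters=200):
    """Robust regularized value iteration with a KL penalty (one
    instance of a phi-divergence).  For each (s, a) an adversary may
    perturb the nominal kernel P[s, a] but pays lam * KL(q || p);
    the inner infimum has the closed form -lam * log E_p[exp(-V/lam)].
    P has shape (S, A, S), R has shape (S, A)."""
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        # worst-case expected next value under the KL-penalized adversary
        worst = -lam * np.log(np.einsum('saj,j->sa', P, np.exp(-V / lam)))
        V = np.max(R + gamma * worst, axis=1)
    return V
```

Since the adversary can always keep the nominal kernel at zero KL cost, the robust values are never above the nominal ones; sending `lam` to infinity recovers standard value iteration.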
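For the second talk, the key structural fact is that when the stationary distribution is an exponential family $\pi_\theta(x) \propto \exp(\theta \cdot \phi(x))$, the gradient of the average reward $J(\theta) = \mathbb{E}_{\pi_\theta}[r(x)]$ is the covariance $\mathrm{Cov}_{\pi_\theta}(r(x), \phi(x))$. The toy sketch below evaluates that gradient exactly on a finite state space; the feature matrix, learning rate, and function name are assumptions for illustration, not the speakers' method.

```python
import numpy as np

def exp_family_policy_gradient(feat, r, theta, lr=0.1, steps=200):
    """Exact gradient ascent on J(theta) = E_{pi_theta}[r(x)] when the
    stationary distribution is pi_theta(x) ∝ exp(theta . feat[x]).
    For exponential families, grad J = Cov_{pi_theta}(r, feat),
    i.e. E[r*feat] - E[r]*E[feat].  feat has shape (n_states, dim)."""
    for _ in range(steps):
        logits = feat @ theta
        p = np.exp(logits - logits.max())   # stable softmax
        p /= p.sum()
        grad = feat.T @ (p * r) - (p @ r) * (feat.T @ p)
        theta = theta + lr * grad
    return theta
```

With one-hot features this reduces to a softmax policy over states, and ascent shifts stationary mass toward high-reward states.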
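For the third talk, a standard way to compute the Gittins index (when transition probabilities are known, which is exactly the assumption the talk relaxes) is the retirement formulation: the index of state $s$ is the smallest subsidy $\lambda$ at which retiring, i.e. collecting $\lambda$ forever, is optimal in $s$. A minimal bisection-plus-value-iteration sketch, with all names and problem sizes assumed for illustration:

```python
import numpy as np

def gittins_index(P, r, gamma=0.9, tol=1e-6):
    """Gittins index of each state of a single Markovian arm via the
    retirement formulation: the index of s is the smallest subsidy lam
    such that retiring (worth lam / (1 - gamma)) is optimal at s.
    Found by bisection on lam, solving each optimal-stopping problem
    by value iteration.  Indices lie in [min(r), max(r)]."""
    n = len(r)

    def retire_value(lam):
        V = np.full(n, lam / (1 - gamma))
        for _ in range(500):
            V = np.maximum(lam / (1 - gamma), r + gamma * P @ V)
        return V

    idx = np.zeros(n)
    for s in range(n):
        lo, hi = r.min(), r.max()
        while hi - lo > tol:
            lam = 0.5 * (lo + hi)
            V = retire_value(lam)
            # continuing strictly beats retiring at s => index exceeds lam
            if r[s] + gamma * (P[s] @ V) > lam / (1 - gamma) + 1e-12:
                lo = lam
            else:
                hi = lam
        idx[s] = 0.5 * (lo + hi)
    return idx
```

A sanity check: for an arm that never leaves its state, the index of a state equals its one-step reward, since continuing and retiring at $\lambda = r(s)$ are then worth the same.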