Reinforcement Learning for Stochastic Networks, Toulouse

Name: Reinforcement Learning for Stochastic Networks, Toulouse
Start: 2024-06-17T09:00:00+02:00
End: 2024-06-21T18:00:00+02:00
Location: ENSEEIHT

Europe/Paris timezone

Session

Parallel session: Online learning

Jun 19, 2024, 1:30 PM

A002 (ENSEEIHT)

64. Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data

Kishan Panaganti (California Institute of Technology)

6/19/24, 1:30 PM

The robust $\phi$-regularized Markov Decision Process (RRMDP) framework focuses on designing control policies that are robust against parameter uncertainties due to mismatches between the simulator (nominal) model and real-world settings. This work makes two important contributions. First, we propose a model-free algorithm called Robust $\phi$-regularized fitted...

69. Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions

Albert Senen–Cerda (IRIT, LAAS–CNRS, and Université de Toulouse)

6/19/24, 2:00 PM

In this talk, we introduce a policy-gradient method for model-based Reinforcement Learning (RL) that exploits a type of stationary distribution commonly obtained from Markov Decision Processes (MDPs) in stochastic networks, queueing systems, and statistical mechanics. Specifically, when the stationary distribution of the MDP belongs to an exponential family that is parametrized by policy...

73. Tabular and Deep Reinforcement learning for Gittins Index

Tejas Bodas (IIIT Hyderabad)

6/19/24, 2:30 PM

In multi-armed bandit problems with Markovian arms, the Gittins index policy is known to be optimal: it maximizes the expected total discounted reward obtained from pulling the arms. In most realistic scenarios, however, the Markovian state transition probabilities are unknown, and therefore the Gittins indices cannot be computed. One can then resort to reinforcement learning (RL) algorithms...

Timetable: Wednesday, Jun 19, 2024 (A002, ENSEEIHT)

13:30 - 14:00: Kishan Panaganti, Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data

14:00 - 14:30: Albert Senen–Cerda, Score-Aware Policy-Gradient Methods and Performance Guarantees using Local Lyapunov Conditions

14:30 - 15:00: Tejas Bodas, Tabular and Deep Reinforcement learning for Gittins Index
