Reinforcement Learning for Stochastic Networks, Toulouse

Name: Reinforcement Learning for Stochastic Networks, Toulouse
Start: 2024-06-17T09:00:00+02:00
End: 2024-06-21T18:00:00+02:00
Location: ENSEEIHT

Jun 17 – 21, 2024

Europe/Paris timezone

Symphony of experts: orchestration with adversarial insights in reinforcement learning

Jun 17, 2024, 4:30 PM

30m

A001 (ENSEEIHT)

Parallel session: Reinforcement learning in continuous time

Speaker: Chiara Mignacco (Université Paris-Saclay)

Structured reinforcement learning leverages policies with advantageous properties to reach better performance, particularly in scenarios where exploration poses challenges. We explore this field through the concept of orchestration, where a (small) set of expert policies guides decision-making; modeling this setting constitutes our first contribution. We then establish value-function regret bounds for orchestration in the tabular setting by transferring regret-bound results from adversarial settings. We generalize and extend the analysis of natural policy gradient in Agarwal et al. [2021, Section 5.3] to arbitrary adversarial aggregation strategies. We also extend it to the case of estimated advantage functions, providing insights into sample complexity both in expectation and with high probability. A key point of our approach lies in its arguably more transparent proofs compared to existing methods. Finally, we provide simulations for a stochastic matching toy model.
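To make the idea of orchestration concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' algorithm: it assumes a known tabular MDP, exact policy evaluation, and exponential weights (Hedge) run independently in each state as the adversarial aggregation strategy. The function names `evaluate_q` and `orchestrate`, the step size `eta`, and the array layouts are all hypothetical choices made for this example.

```python
import numpy as np


def evaluate_q(P, R, policy, gamma):
    """Exact Q-values of `policy` in a tabular MDP (assumed fully known).

    P: (S, A, S) transition probabilities, R: (S, A) rewards,
    policy: (S, A) action probabilities in each state.
    """
    S, _ = R.shape
    P_pi = np.einsum("sa,sat->st", policy, P)  # state-to-state kernel under policy
    r_pi = np.sum(policy * R, axis=1)          # expected one-step reward per state
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    return R + gamma * (P @ V)                 # shape (S, A)


def orchestrate(P, R, experts, gamma=0.9, eta=1.0, rounds=200):
    """Per-state exponential weights over K expert policies (illustrative only).

    experts: (K, S, A) array, experts[k, s] is expert k's action distribution in s.
    Returns the final weights q of shape (S, K) and the mixed policy of shape (S, A).
    """
    K, S, _ = experts.shape
    log_q = np.full((S, K), -np.log(K))        # uniform initial weights, in log space
    for _ in range(rounds):
        q = np.exp(log_q)
        policy = np.einsum("sk,ksa->sa", q, experts)          # mixture of experts
        Q = evaluate_q(P, R, policy, gamma)
        # Value of following expert k for one step, then the current mixture.
        expert_values = np.einsum("ksa,sa->sk", experts, Q)
        log_q = log_q + eta * expert_values                   # Hedge update
        log_q -= log_q.max(axis=1, keepdims=True)             # numerical stability
        log_q -= np.log(np.exp(log_q).sum(axis=1, keepdims=True))
    q = np.exp(log_q)
    return q, np.einsum("sk,ksa->sa", q, experts)
```

Called with an array of expert policies, `orchestrate` returns one weight vector per state; in a toy problem where one expert is better in every state, the weights should concentrate on that expert. Swapping the Hedge update for another adversarial aggregation rule only changes the three update lines, which is the kind of modularity the abstract points to; replacing `evaluate_q` with an estimate would correspond to the estimated-advantage setting.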

Primary authors: Chiara Mignacco (Université Paris-Saclay); Gilles Stoltz (Université Paris-Saclay); Matthieu Jonckheere (LAAS–CNRS)
