Reinforcement Learning for Stochastic Networks, Toulouse

Name: Reinforcement Learning for Stochastic Networks, Toulouse
Start: 2024-06-17T09:00:00+02:00
End: 2024-06-21T18:00:00+02:00
Location: ENSEEIHT


Europe/Paris timezone

Learning LP-indices in Average-Reward Restless Multi-Armed Bandits

Jun 19, 2024, 1:30 PM

30m

A001 (ENSEEIHT)


Parallel session: Some applications of reinforcement learning to networks

Speaker: Dr Konstantin Avrachenkov (INRIA Sophia Antipolis)

Restless Multi-Armed Bandits (RMABs) are widely used in scheduling,
resource allocation, marketing, and clinical trials, to name a few
application areas. RMABs are Markov Decision Processes with two actions
(active and passive modes) for each arm and with a constraint on the
number of active arms per time slot. Since RMABs are in general
PSPACE-complete, several heuristics, such as the Whittle index and the
LP index, have been proposed. In this talk, I present a reinforcement
learning scheme for LP indices with an almost sure convergence guarantee
in the tabular setting, together with an empirically efficient Deep
Q-learning variant. Several examples, including scheduling in queueing
systems, will be presented. This is joint work with V.S. Borkar and
P. Shah from IIT Bombay.
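
To make the RMAB structure in the abstract concrete, here is a minimal Python sketch of one time slot under a generic index policy: each arm has an index value per state, and the arms with the largest current indices are activated, subject to the budget on active arms. It does not reproduce the learning scheme from the talk. The index values and transition probabilities are made up, and the names `select_active_arms`, `step`, `index_table`, and `transitions` are hypothetical, chosen only for illustration.

```python
import random

def select_active_arms(index_table, states, budget):
    """Activate the `budget` arms whose current-state index is largest."""
    current = [index_table[arm][s] for arm, s in enumerate(states)]
    # sorted() is stable, so ties go to the lower-numbered arm.
    ranked = sorted(range(len(states)), key=lambda arm: -current[arm])
    active = set(ranked[:budget])
    return [1 if arm in active else 0 for arm in range(len(states))]

def step(transitions, states, actions, rng):
    """Sample each arm's next state from transitions[arm][action][state]."""
    next_states = []
    for arm, (s, a) in enumerate(zip(states, actions)):
        probs = transitions[arm][a][s]
        next_states.append(rng.choices(range(len(probs)), weights=probs)[0])
    return next_states

# Toy instance (all numbers are assumptions): 3 arms, 2 states each,
# at most 2 arms active per time slot.
index_table = [[0.1, 0.9], [0.5, 0.2], [0.3, 0.8]]
passive = [[0.9, 0.1], [0.2, 0.8]]  # row s: next-state distribution, action 0
active = [[0.3, 0.7], [0.6, 0.4]]   # row s: next-state distribution, action 1
transitions = [[passive, active] for _ in range(3)]

states = [1, 0, 1]
actions = select_active_arms(index_table, states, budget=2)
next_states = step(transitions, states, actions, random.Random(0))
```

Every arm changes state in every slot, whether active or passive, which is what makes the bandit restless. In a learning setting the entries of `index_table` would be estimated from data instead of being fixed in advance.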

Primary author: Dr Konstantin Avrachenkov (INRIA Sophia Antipolis)

Co-authors: Prof. Vivek Borkar (IITB), Mr Pratik Shah (IITB)

