Reinforcement Learning for Stochastic Networks, Toulouse

Name: Reinforcement Learning for Stochastic Networks, Toulouse
Start: 2024-06-17T09:00:00+02:00
End: 2024-06-21T18:00:00+02:00
Location: ENSEEIHT

Jun 17 – 21, 2024

Europe/Paris timezone

Learning payoffs while routing in skill-based queues

Jun 20, 2024, 4:00 PM

30m

A001 (ENSEEIHT)

Parallel session: Reinforcement learning and queueing II

Speaker: Sanne van Kempen (Eindhoven University of Technology)

We consider skill-based routing in queueing networks with heterogeneous customers and servers, where the quality of service is measured by random rewards that depend on the customer-server pair, and the reward structure is a priori unknown to the system operator. We analyze routing policies that simultaneously learn the system parameters and optimize the reward accumulation, while satisfying queueing stability constraints. To this end, we introduce a model that integrates queueing dynamics and decision making. We use learning techniques from the multi-armed bandit (MAB) framework to propose a definition of regret against a suitable oracle reward and formulate an instance-dependent asymptotic regret lower bound. Since our lower bound is of the same order as results in the classical MAB setting, an asymptotically optimal learning algorithm must exploit the structure of the queueing system to learn as efficiently as in the classical setting, where decisions are not constrained by state-space dynamics. We discuss approaches to overcome this constraint by leveraging the analysis of the transient behavior of the queueing system.
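For readers unfamiliar with the classical MAB setting that the abstract uses as its benchmark, the following is a minimal sketch of regret against an oracle in that unconstrained setting. It is not the routing policy or the regret definition from the talk: there are no queues, no stability constraints, and no customer or server types. The function name `ucb1_pseudo_regret`, the Bernoulli reward means, and the choice of the standard UCB1 index are all assumptions made for illustration.

```python
import math
import random


def ucb1_pseudo_regret(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms and return (pull counts, pseudo-regret).

    Illustrative only: `means` are hypothetical success probabilities, and
    the oracle is the policy that always plays the arm with the largest mean.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    pulls = [0] * n_arms      # number of times each arm was played
    totals = [0.0] * n_arms   # sum of observed rewards per arm
    best_mean = max(means)    # expected per-step reward of the oracle
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: play every arm once.
            arm = t - 1
        else:
            # UCB1 index: empirical mean plus an exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
        # Pseudo-regret accumulates the gap in expected reward to the oracle.
        regret += best_mean - means[arm]

    return pulls, regret


pulls, regret = ucb1_pseudo_regret([0.3, 0.5, 0.8], horizon=5000)
print("pulls per arm:", pulls)
print("pseudo-regret:", round(regret, 2))
```

In this classical sketch the learner may play any arm at any step. In the setting of the talk, routing decisions are additionally constrained by the queueing dynamics, which is the difficulty the abstract points to.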

Primary authors: Dr Fiona Sloothaak (Eindhoven University of Technology); Dr Jaron Sanders (Eindhoven University of Technology); Sanne van Kempen (Eindhoven University of Technology)
