Ernesto Garcia (LAAS), 6/18/24, 3:00 PM
Under constraints on the total simulation time available for a Markov process, we look for regimes where parallel independent simulations can effectively sample unlikely regions of the state space.
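The trade-off described above can be sketched numerically: split a fixed simulation budget across many independent replicas of a Markov process and estimate the probability of an unlikely excursion by the fraction of replicas that reach it. The walk, the threshold, and all numbers below are illustrative assumptions, not taken from the contribution.

```python
import random

def simulate_chain(steps, rng):
    """Simulate a reflected random walk with downward drift
    (an illustrative Markov process where large values are rare)."""
    x = 0
    peak = 0
    for _ in range(steps):
        x += 1 if rng.random() < 0.4 else -1  # drift toward 0
        x = max(x, 0)                         # reflect at the origin
        peak = max(peak, x)
    return peak

def parallel_rare_event_estimate(n_chains, steps, threshold, seed=0):
    """Estimate P(max excursion >= threshold) by averaging over
    independent replicas that share a total budget of n_chains * steps."""
    rng = random.Random(seed)
    hits = sum(simulate_chain(steps, rng) >= threshold
               for _ in range(n_chains))
    return hits / n_chains

# Same total budget can be traded between number of replicas and horizon.
est = parallel_rare_event_estimate(n_chains=1000, steps=100, threshold=8)
print(est)
```

The regimes studied in the talk concern how this replica-vs-horizon split should be chosen; the sketch only shows the estimator itself.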
Purva Joshi (TU/e), 6/18/24, 3:00 PM
The anticipated launch of fully autonomous vehicles presents an opportunity to develop and implement novel traffic management systems, for example at urban intersections. Platoon-forming algorithms, in which vehicles are grouped together at short inter-vehicular distances just before arriving at an intersection at high speed, seem promising from a capacity standpoint. In this work, we...
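The capacity argument behind platoon forming can be illustrated with a back-of-the-envelope saturation-flow computation: shorter inter-vehicular headways mean more vehicles per green hour. The headway values and green fraction below are hypothetical, not from the contribution.

```python
def intersection_capacity(green_frac, headway_s):
    """Lane capacity in vehicles per hour: fraction of time the light
    is green divided by the time headway between successive vehicles."""
    return 3600 * green_frac / headway_s

# Hypothetical numbers: platooning shortens the safe headway.
cap_manual = intersection_capacity(0.5, 2.0)   # human-driven spacing
cap_platoon = intersection_capacity(0.5, 0.6)  # short platoon gaps
print(cap_manual, cap_platoon)
```

Under these assumed numbers the throughput improvement is large, which is the intuition the abstract refers to; the talk's algorithms concern how to form such platoons safely before the stop line.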
Sanne van Kempen (TU/e), 6/18/24, 3:00 PM
We consider skill-based routing in queueing systems with heterogeneous customers and servers, where the quality of service is measured by customer-server dependent random rewards and the reward structure is a priori unknown to the system operator. We analyze routing policies that simultaneously learn the system parameters and optimize the reward accumulation, while satisfying queueing...
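The learn-while-routing idea can be sketched as a simple epsilon-greedy scheme: unknown customer-server mean rewards are estimated online while arrivals are routed. This is a minimal sketch under assumed dynamics; the queueing constraints central to the contribution are deliberately omitted, and all parameters are hypothetical.

```python
import random

def eps_greedy_router(n_classes, n_servers, horizon, eps=0.1, seed=0):
    """Route arriving customers to servers, learning customer-server
    mean rewards online with an epsilon-greedy rule (illustrative only;
    queueing constraints are not modeled)."""
    rng = random.Random(seed)
    # Hypothetical ground-truth mean rewards, unknown to the router.
    true_mean = [[rng.random() for _ in range(n_servers)]
                 for _ in range(n_classes)]
    counts = [[0] * n_servers for _ in range(n_classes)]
    means = [[0.0] * n_servers for _ in range(n_classes)]
    total = 0.0
    for _ in range(horizon):
        c = rng.randrange(n_classes)                  # arriving class
        if rng.random() < eps:
            s = rng.randrange(n_servers)              # explore
        else:                                         # exploit estimate
            s = max(range(n_servers), key=lambda j: means[c][j])
        r = true_mean[c][s] + rng.gauss(0, 0.1)       # noisy reward
        counts[c][s] += 1
        means[c][s] += (r - means[c][s]) / counts[c][s]
        total += r
    return total / horizon

avg = eps_greedy_router(n_classes=3, n_servers=2, horizon=5000)
print(avg)
```

The policies analyzed in the talk additionally guarantee queue stability while learning, which this sketch does not attempt.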
Thomas Hira (IRIT), 6/18/24, 3:00 PM
We investigate a non-preemptive scheduling problem within a class of non-observable environments, framed as a restless multi-armed bandit (RMAB) problem characterized by Markovian dynamics and partial observability. Each arm of this RMAB is modeled as an independent Gilbert-Elliott channel with its own parameters, and the current state of each arm is not observable by the decision-maker, so...
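A Gilbert-Elliott channel and the decision-maker's belief about its hidden state can be sketched in a few lines: the channel alternates between a good and a bad state, and without observations the belief is propagated by the transition probabilities alone. The transition probabilities below are assumed for illustration.

```python
import random

def gilbert_elliott_step(state, p_gb, p_bg, rng):
    """One transition of a Gilbert-Elliott channel:
    state 1 = good, 0 = bad; p_gb = P(good -> bad), p_bg = P(bad -> good)."""
    if state == 1:
        return 0 if rng.random() < p_gb else 1
    return 1 if rng.random() < p_bg else 0

def belief_update(belief, p_gb, p_bg):
    """Propagate the decision-maker's belief that the (unobserved)
    channel is good, one step forward without any observation."""
    return belief * (1 - p_gb) + (1 - belief) * p_bg

rng = random.Random(0)
state, belief = 1, 1.0           # start in the good state, known
for _ in range(10):
    state = gilbert_elliott_step(state, 0.2, 0.3, rng)
    belief = belief_update(belief, 0.2, 0.3)
print(round(belief, 4))
```

Without observations the belief converges geometrically to the stationary probability p_bg / (p_gb + p_bg), here 0.6; the scheduling problem in the talk must act on such beliefs across many arms.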
Lucas Weber (Inria), 6/18/24, 3:00 PM
The expected regret of any reinforcement learning algorithm is lower bounded by $\Omega\left(\sqrt{DXAT}\right)$ for undiscounted returns, where $D$ is the diameter of the Markov decision process, $X$ the size of the state space, $A$ the size of the action space and $T$ the number of time steps. However, this lower bound is general. A smaller regret can be obtained by taking into account some...
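The scale of the $\Omega\left(\sqrt{DXAT}\right)$ bound can be made concrete with a quick numeric instance; constant factors are suppressed and the MDP parameters below are hypothetical.

```python
import math

def regret_lower_bound(D, X, A, T):
    """Order of the minimax regret lower bound Omega(sqrt(D*X*A*T)),
    with constant factors suppressed."""
    return math.sqrt(D * X * A * T)

# Illustrative numbers: diameter 10, 20 states, 4 actions, 1e6 steps.
rb = regret_lower_bound(10, 20, 4, 10**6)
print(rb)
```

For these values the bound is on the order of 3e4, i.e. sublinear in T; the structural assumptions mentioned in the abstract are what allow algorithms to beat this worst-case scaling.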
Adil Zouitine (SUPAERO), 6/18/24, 3:00 PM
Robust reinforcement learning is essential for deploying reinforcement learning algorithms in real-world scenarios where environmental uncertainty predominates.
Traditional robust reinforcement learning often depends on rectangularity assumptions, where adverse probability measures of outcome states are assumed to be independent across different states and actions.
This assumption, rarely...
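The (s,a)-rectangularity assumption mentioned above can be sketched with a toy robust value iteration: the adversary picks, independently for each state-action pair, the transition vector in that pair's uncertainty set that minimizes the agent's value. The MDP, rewards, and uncertainty sets below are invented for illustration.

```python
import numpy as np

# Toy 2-state, 2-action MDP. For each (s, a) the adversary chooses among
# a small finite set of transition vectors -- an (s,a)-rectangular
# uncertainty set, i.e. the independence assumption the abstract questions.
R = np.array([[1.0, 0.0],
              [0.0, 1.0]])          # reward r(s, a)
U = {  # uncertainty sets: (s, a) -> candidate distributions P(. | s, a)
    (0, 0): [np.array([0.9, 0.1]), np.array([0.5, 0.5])],
    (0, 1): [np.array([0.2, 0.8])],
    (1, 0): [np.array([0.7, 0.3])],
    (1, 1): [np.array([0.1, 0.9]), np.array([0.6, 0.4])],
}

def robust_value_iteration(gamma=0.9, iters=500):
    """Robust Bellman iteration: max over actions of the worst-case
    backup, with the adversary minimizing independently per (s, a)."""
    V = np.zeros(2)
    for _ in range(iters):
        Q = np.empty((2, 2))
        for s in range(2):
            for a in range(2):
                Q[s, a] = R[s, a] + gamma * min(p @ V for p in U[(s, a)])
        V = Q.max(axis=1)
    return V

V = robust_value_iteration()
print(V)
```

Because the adversary's choices are decoupled across (s, a) pairs, the inner minimization factorizes and the backup stays tractable; relaxing that independence, as the abstract suggests, removes exactly this factorization.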