Ernesto Garcia (LAAS), 6/18/24, 3:00 PM
Under constraints on the total simulation time available for a Markov process, we look for regimes where parallel independent simulations can effectively sample unlikely regions of the state space.
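The trade-off described above can be sketched numerically: split a fixed simulation budget across many independent replicas of a Markov process and estimate the probability of an unlikely excursion by the fraction of replicas that reach it. The walk, the threshold, and all numbers below are illustrative assumptions, not taken from the contribution.

```python
import random

def simulate_chain(steps, rng):
    """Simulate a reflected random walk with downward drift
    (an illustrative Markov process where large values are rare)."""
    x = 0
    peak = 0
    for _ in range(steps):
        x += 1 if rng.random() < 0.4 else -1  # drift toward 0
        x = max(x, 0)                         # reflect at the origin
        peak = max(peak, x)
    return peak

def parallel_rare_event_estimate(n_chains, steps, threshold, seed=0):
    """Estimate P(max excursion >= threshold) by averaging over
    independent replicas that share a total budget of n_chains * steps."""
    rng = random.Random(seed)
    hits = sum(simulate_chain(steps, rng) >= threshold
               for _ in range(n_chains))
    return hits / n_chains

# Same total budget can be traded between number of replicas and horizon.
est = parallel_rare_event_estimate(n_chains=1000, steps=100, threshold=8)
print(est)
```

The regimes studied in the talk concern how this replica-vs-horizon split should be chosen; the sketch only shows the estimator itself.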
Purva Joshi (TU/e), 6/18/24, 3:00 PM
The anticipated launch of fully autonomous vehicles presents an opportunity to develop and implement novel traffic management systems, for example at urban intersections. Platoon-forming algorithms, in which vehicles are grouped together at short inter-vehicular distances just before arriving at an intersection at high speed, seem promising from a capacity standpoint. In this work, we...
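The capacity argument behind platoon forming can be illustrated with a back-of-the-envelope saturation-flow computation: shorter inter-vehicular headways mean more vehicles per green hour. The headway values and green fraction below are hypothetical, not from the contribution.

```python
def intersection_capacity(green_frac, headway_s):
    """Lane capacity in vehicles per hour: fraction of time the light
    is green divided by the time headway between successive vehicles."""
    return 3600 * green_frac / headway_s

# Hypothetical numbers: platooning shortens the safe headway.
cap_manual = intersection_capacity(0.5, 2.0)   # human-driven spacing
cap_platoon = intersection_capacity(0.5, 0.6)  # short platoon gaps
print(cap_manual, cap_platoon)
```

Under these assumed numbers the throughput improvement is large, which is the intuition the abstract refers to; the talk's algorithms concern how to form such platoons safely before the stop line.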
Sanne van Kempen (TU/e), 6/18/24, 3:00 PM
We consider skill-based routing in queueing systems with heterogeneous customers and servers, where the quality of service is measured by customer-server dependent random rewards and the reward structure is a priori unknown to the system operator. We analyze routing policies that simultaneously learn the system parameters and optimize the reward accumulation, while satisfying queueing...
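The learn-while-routing idea can be sketched as a simple epsilon-greedy scheme: unknown customer-server mean rewards are estimated online while arrivals are routed. This is a minimal sketch under assumed dynamics; the queueing constraints central to the contribution are deliberately omitted, and all parameters are hypothetical.

```python
import random

def eps_greedy_router(n_classes, n_servers, horizon, eps=0.1, seed=0):
    """Route arriving customers to servers, learning customer-server
    mean rewards online with an epsilon-greedy rule (illustrative only;
    queueing constraints are not modeled)."""
    rng = random.Random(seed)
    # Hypothetical ground-truth mean rewards, unknown to the router.
    true_mean = [[rng.random() for _ in range(n_servers)]
                 for _ in range(n_classes)]
    counts = [[0] * n_servers for _ in range(n_classes)]
    means = [[0.0] * n_servers for _ in range(n_classes)]
    total = 0.0
    for _ in range(horizon):
        c = rng.randrange(n_classes)                  # arriving class
        if rng.random() < eps:
            s = rng.randrange(n_servers)              # explore
        else:                                         # exploit estimate
            s = max(range(n_servers), key=lambda j: means[c][j])
        r = true_mean[c][s] + rng.gauss(0, 0.1)       # noisy reward
        counts[c][s] += 1
        means[c][s] += (r - means[c][s]) / counts[c][s]
        total += r
    return total / horizon

avg = eps_greedy_router(n_classes=3, n_servers=2, horizon=5000)
print(avg)
```

The policies analyzed in the talk additionally guarantee queue stability while learning, which this sketch does not attempt.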
Thomas Hira (IRIT), 6/18/24, 3:00 PM
We investigate a non-preemptive scheduling problem within a class of non-observable environments, framed as a restless multi-armed bandit (RMAB) problem characterized by Markovian dynamics and partial observability. Each arm of this RMAB is modeled as an independent Gilbert-Elliott channel with its own parameters, and the current state of each arm is not observable by the decision-maker, so...
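A Gilbert-Elliott channel and the decision-maker's belief about its hidden state can be sketched in a few lines: the channel alternates between a good and a bad state, and without observations the belief is propagated by the transition probabilities alone. The transition probabilities below are assumed for illustration.

```python
import random

def gilbert_elliott_step(state, p_gb, p_bg, rng):
    """One transition of a Gilbert-Elliott channel:
    state 1 = good, 0 = bad; p_gb = P(good -> bad), p_bg = P(bad -> good)."""
    if state == 1:
        return 0 if rng.random() < p_gb else 1
    return 1 if rng.random() < p_bg else 0

def belief_update(belief, p_gb, p_bg):
    """Propagate the decision-maker's belief that the (unobserved)
    channel is good, one step forward without any observation."""
    return belief * (1 - p_gb) + (1 - belief) * p_bg

rng = random.Random(0)
state, belief = 1, 1.0           # start in the good state, known
for _ in range(10):
    state = gilbert_elliott_step(state, 0.2, 0.3, rng)
    belief = belief_update(belief, 0.2, 0.3)
print(round(belief, 4))
```

Without observations the belief converges geometrically to the stationary probability p_bg / (p_gb + p_bg), here 0.6; the scheduling problem in the talk must act on such beliefs across many arms.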
Lucas Weber (Inria), 6/18/24, 3:00 PM
The expected regret of any reinforcement learning algorithm is lower bounded by $\Omega\left(\sqrt{DXAT}\right)$ for undiscounted returns, where $D$ is the diameter of the Markov decision process, $X$ the size of the state space, $A$ the size of the action space and $T$ the number of time steps. However, this lower bound is general. A smaller regret can be obtained by taking into account some...
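The scale of the $\Omega\left(\sqrt{DXAT}\right)$ bound can be made concrete with a quick numeric instance; constant factors are suppressed and the MDP parameters below are hypothetical.

```python
import math

def regret_lower_bound(D, X, A, T):
    """Order of the minimax regret lower bound Omega(sqrt(D*X*A*T)),
    with constant factors suppressed."""
    return math.sqrt(D * X * A * T)

# Illustrative numbers: diameter 10, 20 states, 4 actions, 1e6 steps.
rb = regret_lower_bound(10, 20, 4, 10**6)
print(rb)
```

For these values the bound is on the order of 3e4, i.e. sublinear in T; the structural assumptions mentioned in the abstract are what allow algorithms to beat this worst-case scaling.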
Adil Zouitine (SUPAERO), 6/18/24, 3:00 PM
Robust reinforcement learning is essential for deploying reinforcement learning algorithms in real-world scenarios where environmental uncertainty predominates.
Traditional robust reinforcement learning often depends on rectangularity assumptions, where adverse probability measures of outcome states are assumed to be independent across different states and actions.
This assumption, rarely...
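The (s,a)-rectangularity assumption mentioned above can be sketched with a toy robust value iteration: the adversary picks, independently for each state-action pair, the transition vector in that pair's uncertainty set that minimizes the agent's value. The MDP, rewards, and uncertainty sets below are invented for illustration.

```python
import numpy as np

# Toy 2-state, 2-action MDP. For each (s, a) the adversary chooses among
# a small finite set of transition vectors -- an (s,a)-rectangular
# uncertainty set, i.e. the independence assumption the abstract questions.
R = np.array([[1.0, 0.0],
              [0.0, 1.0]])          # reward r(s, a)
U = {  # uncertainty sets: (s, a) -> candidate distributions P(. | s, a)
    (0, 0): [np.array([0.9, 0.1]), np.array([0.5, 0.5])],
    (0, 1): [np.array([0.2, 0.8])],
    (1, 0): [np.array([0.7, 0.3])],
    (1, 1): [np.array([0.1, 0.9]), np.array([0.6, 0.4])],
}

def robust_value_iteration(gamma=0.9, iters=500):
    """Robust Bellman iteration: max over actions of the worst-case
    backup, with the adversary minimizing independently per (s, a)."""
    V = np.zeros(2)
    for _ in range(iters):
        Q = np.empty((2, 2))
        for s in range(2):
            for a in range(2):
                Q[s, a] = R[s, a] + gamma * min(p @ V for p in U[(s, a)])
        V = Q.max(axis=1)
    return V

V = robust_value_iteration()
print(V)
```

Because the adversary's choices are decoupled across (s, a) pairs, the inner minimization factorizes and the backup stays tractable; relaxing that independence, as the abstract suggests, removes exactly this factorization.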