Jun 17 – 21, 2024
ENSEEIHT
Europe/Paris timezone

Session

Parallel session: Reinforcement learning in MDPs with large state spaces

Jun 18, 2024, 1:30 PM
A001 (ENSEEIHT)

Description

Organizers and chairs: R. Srikant and Yashaswini Murthy

  1. Lei Ying (University of Michigan, Ann Arbor)
    6/18/24, 1:30 PM

    This talk presents our recent results on joint learning and scheduling in queueing systems.

  2. Matias Alvo (Columbia Business School, Decision, Risk and Operations Division)
    6/18/24, 2:00 PM

    Inventory management offers unique opportunities for reliably evaluating and applying deep reinforcement learning (DRL). We introduce Hindsight Differentiable Policy Optimization (HDPO), facilitating direct optimization of a policy's hindsight performance using stochastic gradient descent. HDPO leverages two key elements: (i) an ability to backtest any policy's performance on a sample of ...

  3. Assaf Zeevi (Columbia University)
    6/18/24, 2:30 PM

    We propose a new regret minimization algorithm for episodic sparse linear Markov decision processes (SMDPs), in which the state-transition distribution is a linear function of observed features.
    The only previously known algorithm for SMDPs requires knowledge of the sparsity parameter and oracle access to a reference policy.
    We overcome these limitations by combining the doubly robust method...
