28 July 2025 to 1 August 2025
Time zone: Europe/Paris

Structured Reinforcement Learning

1 August 2025, 14:00
30m
F206

Invited talk Structured learning and stochastic combinatorial optimization: methodological perspectives and applications ML

Speaker

Heiko Hoppe (Technical University of Munich)

Description

When facing contextual multi-stage optimization problems, training combinatorial optimization-enriched machine learning pipelines (ML-CO pipelines) has to date required either imitating expert solutions or using unstructured learning algorithms. The former restricts the use of ML-CO pipelines to problems with traceable offline solutions and relatively homogeneous state spaces, while the latter fails to account for combinatorial action spaces, which can destabilize training.
To mitigate these drawbacks, we introduce structured reinforcement learning (SRL), which enables the stable training of ML-CO pipelines using only collected experience. At the core of its training process, SRL generates and evaluates several actions by perturbing the ML model's output, then performs an update step towards the best actions using a Fenchel-Young loss.
We test SRL in three static and three dynamic environments, representing various industrial applications. We find that SRL substantially outperforms structured imitation learning and unstructured RL, which we attribute to its enhanced exploration of the combinatorial state-action space and its improved training stability.
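
The perturb, evaluate, and update loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy top-k selection problem as the combinatorial layer, a directly parameterized score vector in place of an ML model, Gaussian perturbations, direct reward evaluation of each candidate, and the single best candidate as the update target. The names `top_k_oracle`, `srl_style_step`, and `reward_fn` are hypothetical. The update uses the standard gradient of a perturbation-based Fenchel-Young loss, namely the expected perturbed maximizer minus the target action, estimated here from the same samples that serve as candidates.

```python
import numpy as np


def top_k_oracle(scores, k):
    """Toy combinatorial oracle (assumption): indicator of the k highest scores."""
    action = np.zeros(scores.shape, dtype=float)
    action[np.argsort(scores)[-k:]] = 1.0
    return action


def srl_style_step(theta, reward_fn, k, rng, n_samples=20, sigma=1.0, lr=0.1):
    """One perturb-evaluate-update step on the score vector theta (a sketch)."""
    # Generate candidate actions by perturbing the scores before the oracle call.
    noise = rng.standard_normal((n_samples, theta.size))
    candidates = np.array([top_k_oracle(theta + sigma * z, k) for z in noise])

    # Evaluate every candidate and keep the best one as the update target.
    rewards = np.array([reward_fn(a) for a in candidates])
    target = candidates[np.argmax(rewards)]

    # Monte Carlo gradient of the perturbed Fenchel-Young loss:
    # E[oracle(theta + sigma * Z)] - target.
    grad = candidates.mean(axis=0) - target
    return theta - lr * grad, target


# Usage on a hypothetical instance: pick 3 of 6 items with hidden values.
values = np.array([5.0, 1.0, 4.0, 0.0, 3.0, 2.0])
rng = np.random.default_rng(0)
theta = np.zeros(values.size)
for _ in range(50):
    theta, _ = srl_style_step(theta, lambda a: float(values @ a), k=3, rng=rng)
print(top_k_oracle(theta, 3))
```

Each step raises the scores of items in the best sampled action and lowers the others, so no expert solutions are needed; only the rewards of actions the pipeline itself generated.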

Authors

Heiko Hoppe (Technical University of Munich), Léo Baty (Ecole des Ponts), Louis Bouvier (Ecole des Ponts), Axel Parmentier, Maximilian Schiffer (TU Munich)

Presentation materials