July 28, 2025 to August 1, 2025
Europe/Paris timezone

Global convergence of stochastic gradient bandits for any learning rates

Not scheduled
30m
F206

Invited talk · Machine learning (ML)

Speaker

Jincheng Mei (Google DeepMind)

Description

We provide a new understanding of the stochastic gradient bandit algorithm by showing that it converges to a globally optimal policy almost surely using any constant learning rate. This result demonstrates that the stochastic gradient algorithm continues to balance exploration and exploitation appropriately even in scenarios where standard smoothness and noise control assumptions break down. The proofs are based on novel findings about action sampling rates and the relationship between cumulative progress and noise, and extend the current understanding of how simple stochastic gradient methods behave in bandit settings.
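For readers unfamiliar with the algorithm the abstract analyzes, below is a minimal NumPy sketch of the standard softmax (REINFORCE-style) gradient bandit update with a constant learning rate. The function name, parameters, and the Bernoulli reward model are illustrative assumptions, not the paper's exact setting.

import numpy as np

def gradient_bandit(means, alpha, steps, seed=0):
    # Softmax gradient bandit with a constant learning rate alpha.
    rng = np.random.default_rng(seed)
    K = len(means)
    theta = np.zeros(K)              # softmax logits
    for _ in range(steps):
        pi = np.exp(theta - theta.max())
        pi /= pi.sum()               # softmax policy over K arms
        a = rng.choice(K, p=pi)      # sample an action from the policy
        r = float(rng.random() < means[a])  # Bernoulli reward (assumed model)
        # Unbiased REINFORCE gradient: r * (e_a - pi), since
        # grad_theta log pi(a) = e_a - pi under the softmax.
        grad = -r * pi
        grad[a] += r
        theta += alpha * grad        # constant step size, any alpha > 0
    return pi

print(gradient_bandit([0.2, 0.5, 0.8], alpha=1.0, steps=50000))

In this sketch, the result advertised in the abstract corresponds to the returned policy concentrating on the best arm almost surely as steps grows, for any fixed alpha > 0.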

Author

Jincheng Mei (Google DeepMind)

Presentation materials

No materials.