Speaker
Nicolas Gast
(Inria, Univ. Grenoble Alpes)
Description
Stochastic approximation algorithms are quite popular in reinforcement learning, notably because they are powerful tools to study the convergence of algorithms based on stochastic gradient descent (such as Q-learning or policy gradient). In this talk, I will focus on constant step-size stochastic approximation and present tools to compute its asymptotic bias, which is non-zero (for both martingale and Markovian noise), contrary to the case of decreasing step-sizes. The analysis is based on a fine comparison of the generators of the stochastic system and of its deterministic counterpart, and is similar in spirit to Stein's method.
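The bias phenomenon discussed in the abstract is easy to observe numerically. Below is a minimal illustrative sketch (not the authors' method): a one-dimensional constant step-size iteration x ← x + α(h(x) + ξ) with martingale (i.i.d. Gaussian) noise and the asymmetric drift h(x) = 1 − e^x, whose deterministic fixed point is x* = 0. Because h is nonlinear, the stationary mean of the iterates sits at a distance O(α) from x*; the function name and the choice of h are purely for this demonstration.

```python
import random

def sa_mean(alpha, n_iters=200_000, seed=0):
    """Time-average of constant step-size stochastic approximation
    x <- x + alpha * (h(x) + noise), with h(x) = 1 - exp(x).

    The deterministic ODE x' = h(x) converges to x* = 0, but the
    stochastic iterates have a stationary mean below 0: at stationarity
    E[exp(X)] = 1, so by Jensen's inequality E[X] < 0, with a bias of
    order alpha (roughly -alpha/4 here, for unit-variance noise).
    """
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    burn = n_iters // 10  # discard transient before averaging
    for t in range(n_iters):
        noise = rng.gauss(0.0, 1.0)  # martingale-difference noise
        x += alpha * ((1.0 - pow(2.718281828459045, x)) + noise)
        if t >= burn:
            total += x
    return total / (n_iters - burn)

# The empirical bias is negative and shrinks proportionally to alpha,
# consistent with the constant step-size behavior described above.
for alpha in (0.1, 0.05):
    print(alpha, sa_mean(alpha))
```

Halving the step-size roughly halves the observed bias, whereas with a decreasing step-size schedule the same iteration would converge to x* = 0.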
Primary author
Nicolas Gast
(Inria, Univ. Grenoble Alpes)
Co-author
Sebastian Allmeier
(Inria, Univ. Grenoble Alpes)