Speaker
Description
Many approaches to optimization without derivatives that are rooted in probability theory are variants of stochastic approximation, such as the well-known Kiefer-Wolfowitz method, a finite-difference stochastic approximation (FDSA) algorithm that estimates gradients using finite differences of noisy function values. Such methods are known to converge slowly: in many cases the best possible convergence rate is governed by the Central Limit Theorem, leading to a mean square error that vanishes at a rate inversely proportional to the number of iterations.
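For illustration only, here is a minimal sketch of a Kiefer-Wolfowitz-style FDSA iteration: the gradient is estimated by central finite differences of noisy evaluations, and the gain sequences decay over iterations. The function names (fdsa_minimize, sphere) and the gain exponents are illustrative, common textbook choices rather than anything taken from the talk.

```python
import numpy as np

def fdsa_minimize(f, x0, n_iter=10_000, a=0.5, c=0.1, noise=0.01, seed=0):
    """Kiefer-Wolfowitz-style FDSA (illustrative sketch).

    The gradient is estimated coordinate-wise by central finite differences
    of noisy function values; the step-size gain a_k and the difference
    width c_k decay with the iteration counter k.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    F = lambda z: f(z) + noise * rng.standard_normal()  # noisy zeroth-order oracle
    for k in range(1, n_iter + 1):
        a_k = a / k              # step-size gain, decaying like 1/k
        c_k = c / k ** (1 / 6)   # finite-difference width, slowly decaying
        g = np.empty(d)
        for i in range(d):
            e = np.zeros(d); e[i] = c_k
            g[i] = (F(x + e) - F(x - e)) / (2 * c_k)  # central difference
        x = x - a_k * g
    return x

# Example on a simple quadratic (hypothetical test function):
sphere = lambda z: float(np.dot(z, z))
print(fdsa_minimize(sphere, np.ones(3)))
```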
In this talk, I will show that these slow convergence rates are not a foregone conclusion for stochastic algorithms without derivatives. I will present a class of adaptive stochastic algorithms originating from the class of Evolution Strategy algorithms, for which we can prove asymptotic geometric convergence of the mean square error on classes of functions that include non-convex and non-quasi-convex functions. This corresponds to linear convergence in optimization. I will highlight the main differences compared to FDSA algorithms and explain how the stability analysis of an underlying Markov chain enables linear convergence guarantees.
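As a concrete, hedged example of step-size adaptation in an Evolution Strategy, the sketch below implements a (1+1)-ES with the classical 1/5th-success rule; it is not the specific algorithms or proofs of the talk, but on simple scale-invariant functions such as the sphere it typically exhibits the geometric (linear) convergence behaviour described above. The constants 1/3 and 1/12 are common illustrative choices.

```python
import numpy as np

def one_plus_one_es(f, x0, sigma0=1.0, n_iter=2000, seed=0):
    """(1+1)-Evolution Strategy with 1/5th-success-rule step-size adaptation
    (illustrative sketch). The step-size sigma is increased after a successful
    mutation and decreased otherwise, keeping sigma roughly constant at a
    success rate of about 1/5."""
    rng = np.random.default_rng(seed)
    x, sigma = np.asarray(x0, dtype=float), sigma0
    fx = f(x)
    for _ in range(n_iter):
        y = x + sigma * rng.standard_normal(x.size)  # Gaussian mutation
        fy = f(y)
        if fy <= fx:                  # success: accept and enlarge the step
            x, fx = y, fy
            sigma *= np.exp(1 / 3)
        else:                         # failure: shrink the step
            sigma *= np.exp(-1 / 12)  # exp(1/3) * exp(-1/12)**4 == 1
    return x, fx, sigma

sphere = lambda z: float(np.dot(z, z))
x, fx, sigma = one_plus_one_es(sphere, np.ones(10))
print(fx)  # typically many orders of magnitude below f(x0)
```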
I will discuss the connection to the analysis of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), widely regarded as one of the most effective stochastic algorithms for solving complex derivative-free optimization problems.
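For readers who want to try CMA-ES in practice, a minimal ask-and-tell loop is sketched below, assuming Nikolaus Hansen's pycma package (not referenced in this abstract; the starting point, step-size, and test function are arbitrary choices for illustration).

```python
# pip install cma   (the pycma package)
import cma

# Minimize the Rosenbrock test function in 10-D.
es = cma.CMAEvolutionStrategy(10 * [0.1], 0.5)  # initial mean and step-size
while not es.stop():
    solutions = es.ask()                                      # sample a population
    es.tell(solutions, [cma.ff.rosen(x) for x in solutions])  # rank and adapt
    es.disp()                                                 # occasional progress output
print(es.result.xbest, es.result.fbest)
```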