Séminaire de Probabilités commun ICJ/UMPA

Mean-field analysis of the training dynamics of two-layer neural networks

by Arnaud Descours (ISFA)

Europe/Paris — Fokko du Cloux room (ICJ)
Description

Training neural networks via stochastic gradient descent (SGD) amounts to solving a complex, non-convex optimization problem. Inspired by statistical mechanics, the mean-field approach provides a macroscopic description of the training dynamics, which can be formulated as a convex optimization problem. In this talk, I will explain how to rigorously derive the macroscopic (mean-field) description from the microscopic dynamics given by the SGD updates. More precisely, I will establish the mean-field limit (a law of large numbers) and study the fluctuations around this limit (a central limit theorem). If time allows, I will present similar results in the Bayesian framework.
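To make the setting concrete, here is a minimal sketch (an assumed setup, not the speaker's exact formulation) of a two-layer network in the mean-field scaling, f(x) = (1/N) Σᵢ aᵢ tanh(wᵢ·x), trained by online SGD on a toy teacher task. The microscopic object is the collection of particles (aᵢ, wᵢ); the mean-field limit describes the evolution of their empirical measure as N → ∞. The teacher vector, learning rate, and step count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

N, d = 200, 3                        # number of neurons, input dimension
a = rng.normal(size=N)               # output weights (one particle coordinate)
W = rng.normal(size=(N, d))          # input weights (the other coordinate)
lr = 0.1                             # learning rate (illustrative)

w_star = np.array([1.0, -1.0, 0.5])  # hypothetical teacher: y = tanh(w_star . x)

def predict(x):
    # mean-field scaling: 1/N in front of the sum over neurons
    return np.mean(a * np.tanh(W @ x))

def risk(xs, ys):
    return np.mean([(predict(x) - y) ** 2 for x, y in zip(xs, ys)])

# held-out sample to monitor the population risk
xs = rng.normal(size=(500, d))
ys = np.tanh(xs @ w_star)

before = risk(xs, ys)
for _ in range(5000):                # one SGD step per fresh sample (online SGD)
    x = rng.normal(size=d)
    y = np.tanh(x @ w_star)
    z = np.tanh(W @ x)
    err = predict(x) - y
    # squared-loss gradients in the 1/N scaling; the factor N that appears
    # in the mean-field parametrization is absorbed into the learning rate
    a -= lr * err * z
    W -= lr * err * (a * (1 - z ** 2))[:, None] * x[None, :]
after = risk(xs, ys)
print(f"risk: {before:.4f} -> {after:.4f}")
```

As N grows, the trajectory of the empirical measure (1/N) Σᵢ δ₍ₐᵢ,wᵢ₎ concentrates around a deterministic limit (the law of large numbers mentioned above), with Gaussian fluctuations of order 1/√N around it (the central limit theorem).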