Séminaire de Probabilités commun ICJ/UMPA
Mean-field analysis of the training dynamics of two-layer neural networks
by
Fokko du Cloux (ICJ)
Description
Training neural networks via stochastic gradient descent (SGD) amounts to solving a complex, non-convex optimization problem. Inspired by statistical mechanics, the mean-field approach provides a macroscopic description of the training dynamics, which can be formulated as a convex optimization problem. In this talk, I will explain how to rigorously derive the macroscopic (mean-field) description from the microscopic dynamics given by the SGD updates. More precisely, I will establish the mean-field limit (a law of large numbers) and study the fluctuations around this limit (a central limit theorem). If time allows, I will present similar results in the Bayesian framework.
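For orientation, the objects named in the abstract can be written down in the setup commonly used in the mean-field literature. This is only a sketch under assumptions: the notation (N hidden units with parameters \theta_i, unit function \sigma_*, squared loss, fluctuation field \eta^N_t) is illustrative and does not come from the announcement, and the speaker's precise assumptions, scaling, and mode of convergence may differ.

```latex
% Two-layer network with N hidden units, in mean-field (1/N) scaling (assumed notation)
f_N(x) = \frac{1}{N} \sum_{i=1}^{N} \sigma_*(x; \theta_i)
       = \int \sigma_*(x; \theta) \, \mu^N(d\theta),
\qquad
\mu^N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\theta_i}.

% Population risk viewed as a functional of the parameter distribution (squared loss assumed).
% It is convex in \mu: a convex function composed with a map that is linear in \mu.
R(\mu) = \mathbb{E}_{(x,y)} \Big[ \Big( y - \int \sigma_*(x; \theta) \, \mu(d\theta) \Big)^{2} \Big].

% Law of large numbers: the empirical measure of the SGD iterates at (rescaled) time t
% converges to a deterministic limit \mu_t.
\mu^N_t \xrightarrow[N \to \infty]{} \mu_t.

% Central limit theorem: fluctuations around the limit are of order N^{-1/2}.
\eta^N_t = \sqrt{N} \, \big( \mu^N_t - \mu_t \big)
\xrightarrow[N \to \infty]{\text{law}} \eta_t.
```

In this formulation the non-convexity in the individual parameters \theta_i is traded for convexity in the measure \mu, which is the sense in which the macroscopic description becomes a convex problem.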