Fluctuations and concentration in two-layer neural networks

26 Feb 2026, 14:40
20m
Amphi 2 (Pôle Commun)

Université Clermont Auvergne, Campus des Cézeaux, 63170 Aubière

Description

We study the learning dynamics of wide two-layer neural networks trained by stochastic gradient descent (SGD), aiming to understand quantitatively how network width shapes both the typical training trajectory and the variability of the final predictor.

We adopt an interacting particle viewpoint in which neurons evolve under SGD as a large coupled system. As the width grows, this collective dynamics is well approximated by a deterministic mean-field limit, which provides an analytically tractable description of how the parameter distribution (and hence predictions) evolves during training.
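For orientation, one common way to formalize this particle viewpoint is sketched below. The notation is an assumption made for illustration: the activation $\sigma_*$, the parameters $\theta^i_k$, the learning rate $\alpha$, the data distribution $\pi$, the squared loss, and the time scale $t = k/N$ are not fixed by the abstract, and the talk may use a different model or scaling.

```latex
% Illustrative notation only (assumed, not taken from the abstract).

% Width-N two-layer network with mean-field (1/N) normalization:
f_N(x;\theta^1,\dots,\theta^N) = \frac{1}{N}\sum_{i=1}^{N} \sigma_*(x,\theta^i)

% One SGD step on the squared loss with a fresh sample (x_k, y_k) \sim \pi;
% the 1/N factor comes from differentiating f_N with respect to \theta^i:
\theta^i_{k+1} = \theta^i_k
  - \frac{\alpha}{N}\,\bigl(f_N(x_k;\theta_k) - y_k\bigr)\,
    \nabla_\theta \sigma_*(x_k,\theta^i_k)

% Empirical measure of the neurons on the time scale t = k/N:
\mu^N_t = \frac{1}{N}\sum_{i=1}^{N} \delta_{\theta^i_{\lfloor Nt \rfloor}}

% Candidate mean-field limit: a typical neuron \bar\theta_t follows the
% averaged gradient flow driven by its own law \bar\mu_t:
\frac{d}{dt}\bar\theta_t
  = -\alpha \int \bigl(\langle \sigma_*(x,\cdot), \bar\mu_t \rangle - y\bigr)\,
    \nabla_\theta \sigma_*(x,\bar\theta_t)\,\pi(dx,dy),
\qquad \bar\mu_t = \mathrm{Law}(\bar\theta_t)
```

In this sketch each neuron interacts with the others only through the empirical measure $\mu^N_t$, which is what makes the interacting particle analogy apply.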

We then quantify finite-width effects through two complementary results. First, we characterize fluctuations around the mean-field limit: after the natural rescaling, we show that the deviations converge to a Gaussian limiting process, yielding an explicit description of the variability induced by training randomness. Second, we establish finite-width concentration inequalities, uniform over training time, which control with high probability how close a width-N network remains to its mean-field proxy.
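The generic shape of these two statements can be written schematically as follows. This is only a sketch under assumptions: the $\sqrt{N}$ rescaling is the usual central-limit normalization, while the distance $d$, the time set $\mathcal{T}$, and the rate $r(N,\delta)$ are placeholders that the abstract does not specify.

```latex
% Schematic only; d, \mathcal{T} and r(N,\delta) are placeholders (assumed).
% \mu^N_t: empirical measure of the N neurons at training time t
% \bar\mu_t: its deterministic mean-field limit

% (1) Fluctuations: rescaled deviations converge in law to a Gaussian process
\eta^N_t = \sqrt{N}\,\bigl(\mu^N_t - \bar\mu_t\bigr)
\;\xrightarrow[N\to\infty]{\ \text{law}\ }\; \eta_t

% (2) Concentration, uniform over the training times t in \mathcal{T}:
\mathbb{P}\Bigl( \sup_{t \in \mathcal{T}} d\bigl(\mu^N_t, \bar\mu_t\bigr)
  \le r(N,\delta) \Bigr) \ge 1 - \delta,
\qquad r(N,\delta) \to 0 \ \text{as } N \to \infty
```

The first statement is asymptotic and describes the law of the deviations, whereas the second is non-asymptotic and holds for each fixed width N.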

Authors

Dr Arnaud Descours (ISFA)
Prof. Arnaud Guillin (Université Clermont-Auvergne)
Boris Nectoux (LMBP - Clermont Fd)
Dr Geoffrey Lacour (INRAE Paris-Saclay)
Manon Michel (Laboratoire de Mathématiques Blaise Pascal - Université Clermont-Auvergne)
Paul Stos (Université Clermont-Auvergne)
