Training neural networks via stochastic gradient descent (SGD) amounts to solving a complex, non-convex optimization problem. Inspired by statistical mechanics, the mean-field approach provides a macroscopic description of the training dynamics, which can be formulated as a convex optimization problem. In this talk, I will explain how to rigorously derive the macroscopic (mean-field) description from the microscopic dynamics given by the SGD updates. More precisely, I will establish the mean-field limit (a law of large numbers) and study the fluctuations around this limit (a central limit theorem). If time allows, I will present similar results in the Bayesian framework.
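For orientation, here is a rough sketch of the standard mean-field setup (assumed here: a two-layer network with N neurons and an effective potential \Psi; the talk's precise setting may differ). The microscopic state is summarized by the empirical measure of the parameters:

\[
  f_N(x) \;=\; \frac{1}{N}\sum_{i=1}^{N} \sigma(x;\theta_i),
  \qquad
  \mu_t^{N} \;=\; \frac{1}{N}\sum_{i=1}^{N} \delta_{\theta_i(t)}.
\]

In this framing, the mean-field limit (law of large numbers) says that as N \to \infty the empirical measure \mu_t^{N} converges to a deterministic measure \mu_t solving a McKean--Vlasov-type evolution equation, while the central limit theorem characterizes the rescaled fluctuations:

\[
  \partial_t \mu_t \;=\; \nabla_\theta \!\cdot\! \big( \mu_t\, \nabla_\theta \Psi(\theta;\mu_t) \big),
  \qquad
  \sqrt{N}\,\big(\mu_t^{N} - \mu_t\big) \;\Rightarrow\; \text{a Gaussian process}.
\]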