3rd Edition of Mathematics for and by Large Language Models
Thursday 28 May 2026

09:00 - 09:30
Welcome coffee
Room: Centre de conférences Marilyn et James Simons

09:30 - 10:30
TBA
Quentin Berthet (Google DeepMind)
Room: Centre de conférences Marilyn et James Simons

10:30 - 11:30
The Expressive Power of Large Language Models
Gabriel Peyré (CNRS, DMA, École Normale Supérieure)
Room: Centre de conférences Marilyn et James Simons
Large language models process vast sequences of input tokens by alternating between classical multi-layer perceptron layers and self-attention mechanisms. While the approximation capabilities of perceptrons are relatively well understood, those of attention mechanisms remain less explored. In this talk, I will compare the proof techniques and approximation results associated with these two types of layers, emphasizing key open questions that connect large language models with approximation theory in infinite-dimensional spaces representing input token distributions.
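As a concrete illustration of the alternation the abstract describes, here is a minimal NumPy sketch of one transformer block: a single-head self-attention layer, in which the tokens interact, followed by a position-wise multi-layer perceptron, which processes each token independently. The dimensions, residual connections, and omission of layer normalization are illustrative assumptions, not details from the talk.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Each token forms a query, scores it against all keys, and averages the values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V               # tokens interact here

def mlp(X, W1, b1, W2, b2):
    # Two-layer ReLU perceptron applied to each token position independently.
    return np.maximum(X @ W1 + b1, 0.0) @ W2 + b2

rng = np.random.default_rng(0)
n, d, h = 8, 16, 32                          # tokens, model width, hidden width
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
W1, b1 = rng.normal(size=(d, h)) / np.sqrt(d), np.zeros(h)
W2, b2 = rng.normal(size=(h, d)) / np.sqrt(h), np.zeros(d)

# One step of the alternation: attention (token mixing) then MLP (per-token map),
# each with a residual connection.
X = X + self_attention(X, Wq, Wk, Wv)
X = X + mlp(X, W1, b1, W2, b2)
print(X.shape)                               # (8, 16): one token embedding per row
```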

11:30 - 12:00
Coffee break
Room: Centre de conférences Marilyn et James Simons
12:00
ReLU and Softplus Neural Nets as Zero-Sum, Turn-Based, Stopping Games
-
Yiannis Vlassopoulos
(
Athena Research Center & IHES
)
ReLU and Softplus Neural Nets as Zero-Sum, Turn-Based, Stopping Games
Yiannis Vlassopoulos
(
Athena Research Center & IHES
)
12:00 - 13:00
Room: Centre de conférences Marilyn et James Simons
Neural networks are for the most part treated as black boxes. In an effort to understand the mathematical structure that underlies them, we will explain how ReLU neural nets can be interpreted as zero-sum, turn-based, stopping games. The game runs in the opposite direction to the net: the input to the net is the terminal reward of the game, and the output of every neuron turns out to equal the value of the game at a corresponding state. The weights are used to define state-transition probabilities and the biases to define rewards. Running the ReLU net becomes the same as running the Shapley-Bellman backwards recursion (which in this case is minimax dynamic programming) for the value of the game. As an application, we obtain bounds for the output of every neuron of the net, given bounds for the input to the net. Moreover, the game interpretation links the ReLU net with statistical mechanics, interpreting the output of every neuron as a discrete path integral. We will also explain consequences of the game point of view for the interpretability of the net considered as a classifier. Adding an entropic regularization to the ReLU net game allows us to interpret Softplus neural nets as games in an analogous fashion. This is joint work with Stéphane Gaubert.
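The per-neuron bounds mentioned in the abstract can be illustrated with standard interval arithmetic, propagating input bounds layer by layer. This minimal sketch uses the usual monotonicity argument rather than the game-theoretic derivation of the talk, and the network weights are arbitrary placeholders.

```python
import numpy as np

def affine_bounds(lo, hi, W, b):
    # Interval arithmetic for x -> W @ x + b: positive weights carry lower bounds
    # to lower bounds, negative weights swap them.
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b

def relu_net_bounds(lo, hi, layers):
    # Bounds for the output of every neuron, given bounds on the input.
    # ReLU is monotone, so it maps interval endpoints to interval endpoints.
    per_layer = []
    for W, b in layers:
        lo, hi = affine_bounds(lo, hi, W, b)
        lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
        per_layer.append((lo, hi))
    return per_layer

rng = np.random.default_rng(1)
layers = [(rng.normal(size=(4, 3)), rng.normal(size=4)),   # placeholder weights
          (rng.normal(size=(2, 4)), rng.normal(size=2))]

# Each printed pair brackets that layer's neuron outputs for any input in [-1, 1]^3.
for i, (lo, hi) in enumerate(relu_net_bounds(np.full(3, -1.0), np.full(3, 1.0), layers)):
    print(f"layer {i}: {np.round(lo, 2)} <= output <= {np.round(hi, 2)}")
```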

13:00 - 14:00
Lunch - Buffet
Room: Centre de conférences Marilyn et James Simons

14:00 - 15:00
TBA
Edward Lockhart (Google DeepMind)
Room: Centre de conférences Marilyn et James Simons