3rd Edition of Mathematics for and by Large Language Models

Europe/Paris
Centre de conférences Marilyn et James Simons (Le Bois-Marie)

35, route de Chartres CS 40001 91893 Bures-sur-Yvette Cedex
Description

The goal of this conference is to advance the dialogue and interactions between the LLM community and the wider world of mathematics, in order to further the mathematical understanding of LLMs and to help solve some of the outstanding problems in this new field.

In particular, we intend to investigate mathematical structures that can be used to understand what LLMs implicitly learn and how they learn it.

In the opposite direction, we will also investigate the use of LLMs to do mathematics.

Registration is free and open until May 20, 2026.

Invited speakers:
Quentin Berthet (Google DeepMind)
Edward Lockhart (Google DeepMind)
Gabriel Peyré (CNRS, DMA, École Normale Supérieure)
Yiannis Vlassopoulos (Athena Research Center & IHES)

Organizers: 
Michael Douglas (Harvard University & IHES), Amaury Hayat (CERMICS), Julio Parra-Martinez (IHES) and Yiannis Vlassopoulos (Athena Research Center & IHES)

The math and LLM day and the workshop on AI for the study of Amplitudes are supported by the Google DeepMind AI for Math Initiative, and the IHES thanks Google DeepMind for their support.

Cécile Gourgues
    • 09:00 09:30
      Welcome coffee 30m
    • 09:30 10:30
      MIND: Monge Inception Distance for Generative Models Evaluation 1h

      We propose the Monge Inception Distance (MIND), a metric for evaluating generative models that addresses key limitations of the widely adopted Fréchet Inception Distance (FID). The MIND metric leverages the sliced Wasserstein distance to compare distributions by averaging one-dimensional optimal transport distances, efficiently computed via sorting. This approach circumvents the estimation of high-dimensional means and covariance matrices, which underlie FID's poor sample complexity and vulnerability to adversarial attacks. We empirically demonstrate three primary advantages: (i) it is more sample-efficient by one order of magnitude, (ii) it is faster to compute by two orders of magnitude, (iii) it is more robust to adversarial attacks such as moment-matching. We show that MIND with 5k samples can replace the evaluation performance of FID with 50k samples, providing high correlation with this standard benchmark and superior discriminative performance. We further demonstrate that even smaller sample sizes (e.g., 1k or 2k) remain highly informative for rapid model iteration.
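
The sorting-based computation mentioned in the abstract can be illustrated with a minimal sketch of a sliced 2-Wasserstein estimate. This is not the MIND metric itself: the Inception feature extraction is omitted, and the function names, the number of projections, the exponent, and the equal-sample-size restriction are all assumptions made for illustration.

```python
import math
import random


def random_direction(dim, rng):
    """Draw a unit vector uniformly on the sphere by normalizing a Gaussian."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def wasserstein_1d_squared(a, b):
    """Squared 2-Wasserstein distance between two equal-size 1D samples.

    In one dimension the optimal transport plan matches sorted values.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    return sum((x - y) ** 2 for x, y in zip(a_sorted, b_sorted)) / len(a_sorted)


def sliced_wasserstein(xs, ys, n_projections=64, seed=0):
    """Monte Carlo estimate of the sliced 2-Wasserstein distance.

    xs, ys: equal-length lists of points (lists of floats) of the same dimension.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        # Project both samples onto the same random line.
        px = [sum(t * c for t, c in zip(theta, x)) for x in xs]
        py = [sum(t * c for t, c in zip(theta, y)) for y in ys]
        total += wasserstein_1d_squared(px, py)
    return math.sqrt(total / n_projections)


if __name__ == "__main__":
    rng = random.Random(1)
    real = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(500)]
    fake = [[rng.gauss(0.5, 1.0) for _ in range(8)] for _ in range(500)]
    print(sliced_wasserstein(real, real))  # identical samples
    print(sliced_wasserstein(real, fake))  # mean-shifted samples
```

Each projection costs one sort per sample, and no mean vector or covariance matrix is ever estimated, which is the contrast with FID that the abstract draws.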

      Joint work with Yu-Han Wu, Clément Crepy, Romuald Elie, Klaus Greff, Michael E. Sander

      Speaker: Quentin Berthet (Google DeepMind)
    • 10:30 11:30
      The Expressive Power of Large Language Models 1h

      Large language models process vast sequences of input tokens by alternating between classical multi-layer perceptron layers and self-attention mechanisms. While the approximation capabilities of perceptrons are relatively well understood, those of attention mechanisms remain less explored. In this talk, I will compare the proof techniques and approximation results associated with these two types of layers, emphasizing key open questions that connect large language models with approximation theory in infinite-dimensional spaces representing input token distributions.
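
The alternation between attention and perceptron layers described in the abstract can be sketched as a single simplified block. This is an illustrative toy under stated assumptions, not the architecture analyzed in the talk: it uses one attention head, no layer normalization, no positional encoding, and hypothetical parameter names (`wq`, `wk`, `wv`, `w1`, `b1`, `w2`, `b2`).

```python
import math


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(scores):
    """Numerically stable softmax of a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, wq, wk, wv):
    """Single-head self-attention: each token averages values over all tokens."""
    queries = [matvec(wq, t) for t in tokens]
    keys = [matvec(wk, t) for t in tokens]
    values = [matvec(wv, t) for t in tokens]
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out


def mlp(token, w1, b1, w2, b2):
    """Two-layer perceptron with ReLU, applied to one token at a time."""
    hidden = [max(0.0, h + b) for h, b in zip(matvec(w1, token), b1)]
    return [o + b for o, b in zip(matvec(w2, hidden), b2)]


def transformer_block(tokens, params):
    """Attention sub-layer, then MLP sub-layer, each with a residual connection."""
    attended = self_attention(tokens, params["wq"], params["wk"], params["wv"])
    tokens = [[x + a for x, a in zip(t, at)] for t, at in zip(tokens, attended)]
    updated = []
    for t in tokens:
        m = mlp(t, params["w1"], params["b1"], params["w2"], params["b2"])
        updated.append([x + y for x, y in zip(t, m)])
    return updated
```

In this simplified form the perceptron acts on each token separately, while attention is the only step that couples tokens. Without positional information, permuting the input tokens permutes the outputs in the same way.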

      Speaker: Gabriel Peyré (CNRS, DMA, École Normale Supérieure)
    • 11:30 12:00
      Coffee break 30m
    • 12:00 13:00
      ReLU and Softplus Neural Nets as Zero-Sum, Turn-Based, Stopping Games 1h

      Neural networks are for the most part treated as black boxes.
      In an effort to understand the mathematical structure that underlies them, we will explain how ReLU neural nets can be interpreted as zero-sum, turn-based, stopping games.

      The game runs in the opposite direction to the net. The input to the net is the terminal reward of the game, and the output of every neuron turns out to be equal to the value of the game at a corresponding state. The weights are used to define state-transition probabilities and the biases to define rewards.
      Running the ReLU net becomes the same as running the Shapley-Bellman backwards recursion (which in this case is minimax dynamic programming) for the value of the game.

      As an application, we obtain bounds for the output of every neuron of the net, given bounds for the input to the net.
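
For orientation, the task of bounding every neuron given bounds on the input can be illustrated with standard interval arithmetic. This is a swapped-in, well-known baseline and not the game-based bounds of the talk, which are not specified in the abstract. The function names and the convention that a ReLU follows every layer are assumptions of this sketch.

```python
def affine_bounds(weights, biases, lower, upper):
    """Bounds on W x + b when lower <= x <= upper holds coordinate-wise."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, biases):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                # A nonnegative weight preserves the order of the bounds.
                lo += w * l
                hi += w * u
            else:
                # A negative weight swaps them.
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_net_bounds(layers, lower, upper):
    """Propagate input bounds through a ReLU net, returning bounds per layer.

    layers: list of (weights, biases) pairs; a ReLU is applied after each one.
    """
    per_layer = []
    for weights, biases in layers:
        lower, upper = affine_bounds(weights, biases, lower, upper)
        # ReLU is monotone, so it can be applied to both bounds directly.
        lower = [max(0.0, v) for v in lower]
        upper = [max(0.0, v) for v in upper]
        per_layer.append((lower, upper))
    return per_layer
```

The bounds obtained this way are valid but can be loose, because each layer treats its inputs as independent.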

      Moreover, the game interpretation links the ReLU net with statistical mechanics, interpreting the output of every neuron as a discrete path integral.
      We will also explain the consequences of the game point of view for the interpretability of the net, considered as a classifier.

      Adding an entropic regularization to the ReLU net game allows us to interpret Softplus neural nets as games in an analogous fashion.
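
At the level of a single activation, the link between ReLU and Softplus can be checked numerically through a standard variational identity: ReLU(x) is the maximum of p * x over p in [0, 1], and adding an entropy bonus temperature * H(p) turns that maximum into temperature * log(1 + exp(x / temperature)). The sketch below verifies this by grid search. It is only an assumption that this one-neuron identity mirrors the regularization used in the talk, and the function names are hypothetical.

```python
import math


def relu(x):
    """max(0, x), also the maximum of p * x over p in [0, 1]."""
    return max(0.0, x)


def softplus(x, temperature=1.0):
    """temperature * log(1 + exp(x / temperature)), computed stably."""
    z = x / temperature
    return temperature * (max(z, 0.0) + math.log1p(math.exp(-abs(z))))


def binary_entropy(p):
    """Shannon entropy (in nats) of a coin with bias p, with 0 * log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def regularized_value(x, temperature, grid=10001):
    """Maximum over p in [0, 1] of p * x + temperature * H(p), by grid search."""
    best = -math.inf
    for i in range(grid):
        p = i / (grid - 1)
        best = max(best, p * x + temperature * binary_entropy(p))
    return best


if __name__ == "__main__":
    for x in (-2.0, 0.0, 3.0):
        print(x, relu(x), softplus(x), regularized_value(x, 1.0))
```

With temperature set to zero the entropy term vanishes and the grid search returns ReLU(x). The maximizing p is the logistic sigmoid of x / temperature, so the hard choice between 0 and x becomes a soft, randomized one.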

      This is joint work with Stéphane Gaubert.

      Speaker: Yiannis Vlassopoulos (Athena Research Center & IHES)
    • 13:00 14:00
      Lunch - Buffet 1h
    • 14:00 15:00
      Why AI Needs Formal Mathematics 1h

      Current reinforcement learning methods train Large Language Models to generate outputs that satisfy an automated judge. While this drives impressive feats of reasoning, it inadvertently incentivises the superficial appearance of correctness. Models may learn to "reward hack" by glossing over logical flaws or confidently making false claims.
      In this talk, I will explore how some AI researchers are turning to formal verification to solve this illusion of competence. By pairing LLMs with proof assistants, we can shift AI training from adversarial reward-maximisation to a cooperative process where reward hacking becomes impossible. I will also examine the broader implications of this emerging capability, discussing how "formalisation on-demand" can serve as a substitute for human social credibility and lay the groundwork for fully autonomous AI mathematical research.

      Speaker: Edward Lockhart (Google DeepMind)