Mathematics for and by Large Language Models -- 2025 Edition

Centre de Conférences Marilyn et James Simons (Le Bois-Marie)

35, route de Chartres, CS 40001, 91893 Bures-sur-Yvette Cedex
Description

The goal of this conference is to advance the dialogue and interactions between the LLM community and the broader world of mathematics, in order to further the mathematical understanding of LLMs and to contribute to solving some of the outstanding problems in this new field.

In particular, we intend to investigate mathematical structures that can be used to understand LLMs in terms of what they implicitly learn and how.

In the opposite direction, the use of LLMs to do mathematics will also be investigated.

Registration is free and open until May 15, 2025.

Invited speakers:
Katie Collins (Cambridge University)
Yann Fleureau (Numina)
Fabian Glöckle (Ecole Nationale des Ponts et Chaussées & FAIR at Meta, Paris)
Javier Gómez-Serrano (Brown University)
Timothy Gowers (Collège de France)
Yann Ollivier (FAIR at Meta, Paris)
Simon Frieder (Oxford University)

Organizers: 
François Charton (Meta AI Research), Michael Douglas (Harvard University & IHES), Amaury Hayat (CERMICS) & Yiannis Vlassopoulos (Athena Research Center & IHES)

Contact: Cécile Gourgues
    • 09:00 09:30
      Welcome coffee 30m
    • 09:30 10:20
      Project Numina and AI for Theorem Proving 50m

      Project Numina develops open-source, open-dataset AI for mathematics. Numina won the AIMO Progress Prize in 2024 and released NuminaMath, the largest open dataset of math problems and solutions. Last month, Numina released KiminaProver, a state-of-the-art formal math prover. We will present the past, present and future developments of Numina, along with our approaches to AI for theorem proving.

      Speaker: Yann Fleureau (Numina)
    • 10:20 11:10
      Deciding What Game to Play, What Mathematics Problem to Solve 50m

      Mathematicians have limited lifetimes. A hallmark of a good mathematician is not just solving new problems well, but deciding which problems are worth solving in the first place. How do we decide which problems are worth our energy? In this talk, I will present some of our recent work aimed at better understanding how people decide which problems are worthwhile. We focus first on characterizing and computationally modeling people's judgments about games they have never played before, and compare these judgments with those of LLMs, among other alternative models. We then extend this work to mathematics and present early results and thinking on modeling evaluations of whether mathematics problems are worth pursuing (in both “everyday” and expert mathematics contexts). We close by discussing the translational potential of this work for benchmarking LLMs and for informing the design of more human-compatible AI thought partners for mathematics.

      Speaker: Katie Collins (Cambridge University)
    • 11:10 11:40
      Coffee break 30m
    • 11:40 12:30
      A Roadmap for Math Copilots 50m

      Math Copilots are the envisioned successors to the current generation of LLMs, assisting mathematicians in various mathematical activities, from devising proofs to finding counterexamples. In this talk, I will survey and categorize the existing landscape of AI-driven mathematical software tools and benchmarks (including the AIMO series of competitions), and show which benchmarks are still missing if LLMs are to hill-climb in the direction of Math Copilots.
      I will conclude with a call to participate in a new type of benchmark, the BeyondProof benchmark.

      Speaker: Simon Frieder (Oxford University)
    • 12:30 14:00
      Buffet lunch 1h 30m
    • 14:00 14:50
      Why Are LLMs Not Better at Finding Proofs? (video conference) 50m

      TBA

      Speaker: Timothy Gowers (Collège de France)
    • 14:50 15:40
      Machine Learning in PDE: Discovering New, Unstable Solutions 50m

      In this talk, I will explain several recent results combining machine learning techniques with more traditional mathematics. The overarching theme is the interplay between modern (ML) and classical methods in order to discover new solutions of certain PDEs with low or very low numerical error. I will also outline how to turn the approximate numerical solutions into rigorous proofs via computer-assisted methods, leading in some cases to singularity formation, or to the existence of certain special solutions (e.g. traveling waves). In particular, unstable solutions are now amenable to discovery. While the motivation originally comes from the equations of fluid mechanics, such as Euler or Navier-Stokes, some of the methods can be tuned to other PDEs. (A schematic toy illustration of this approximate-then-verify workflow follows this item.)

      Speaker: Javier Gómez-Serrano (Brown University)
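
The approximate-then-verify workflow mentioned in the abstract above can be illustrated with a deliberately simple, hypothetical example; it is not the speaker's method. A small neural ansatz is trained to minimize the residual of a one-dimensional boundary-value problem, and a residual check on a fine grid then stands in for the rigorous bounds (typically obtained with interval arithmetic) that a genuine computer-assisted proof would require. The network architecture, the toy equation, and all numerical choices below are illustrative assumptions.

# Hypothetical toy sketch (PyTorch), not the speaker's method: fit u with
# u(0) = u(1) = 0 to u''(x) = -pi^2 sin(pi x), whose exact solution is sin(pi x).
import math
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

def u(x):
    # The ansatz x * (1 - x) * net(x) enforces the boundary conditions exactly.
    return x * (1 - x) * net(x)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    x = torch.rand(256, 1, requires_grad=True)        # random collocation points
    du = torch.autograd.grad(u(x).sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    residual = d2u + math.pi ** 2 * torch.sin(math.pi * x)
    loss = (residual ** 2).mean()                      # minimize the equation residual
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stand-in for the rigorous step: a real computer-assisted proof would replace
# this grid check with validated (interval-arithmetic) bounds on the residual.
xs = torch.linspace(0, 1, 1001).reshape(-1, 1).requires_grad_(True)
du = torch.autograd.grad(u(xs).sum(), xs, create_graph=True)[0]
d2u = torch.autograd.grad(du.sum(), xs)[0]
print("max residual on grid:", (d2u + math.pi ** 2 * torch.sin(math.pi * xs)).abs().max().item())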
    • 15:40 16:10
      Coffee break 30m
    • 16:10 17:00
      Universal Optimal Control, Reinforcement Learning, and Reaching Goals in LLMs 50m

      TBA

      Speaker: Yann Ollivier (FAIR at Meta, Paris)
    • 17:00 17:50
      Search, Reason or Recombine? Paradigms for Scaling Formal Proving 50m

      In the effort to scale test-time computation for language models on mathematical benchmarks, two prominent paradigms have emerged: large-scale search with reinforcement learning, exemplified by methods like AlphaProof, and long chain-of-thought reasoning with emergent self-verification, as seen in models like o1. For the future of reinforcement learning in formal theorem proving, this opens up a spectrum of hybrid methods, ranging from line-level tree search with environment feedback, to multi-turn iterative whole-proof generation with and without integrated informal reasoning, to hierarchical problem decomposition and the recombination of partial proofs. I will explain these methods as inference methods and discuss the challenges faced when applying reinforcement learning to them. (A minimal sketch of the tree-search end of this spectrum follows this item.)

      Speaker: Fabian Glöckle (Ecole Nationale des Ponts et Chaussées & FAIR at Meta, Paris)
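
To make the "line-level tree search with environment feedback" end of the spectrum described above concrete, here is a minimal, hypothetical sketch. The LeanEnv environment and the propose_tactics policy are stubs standing in for a proof-assistant REPL and a trained language model; nothing here is taken from the speaker's systems.

# Hypothetical sketch of best-first proof search with environment feedback.
# LeanEnv and propose_tactics are stubs; a real system would call a proof
# assistant (e.g. a Lean REPL) and a trained policy model instead.
import heapq
import itertools

class LeanEnv:
    """Stub environment: applying a tactic to a proof state either fails,
    closes the goal, or produces a new proof state (with a step cost)."""
    def apply(self, state, tactic):
        if state == "n + 0 = n" and tactic == "simp":
            return "done", 0.0
        if state == "n + 0 = n" and tactic == "induction n":
            return "case zero: 0 + 0 = 0; case succ: ...", 0.5
        return None, float("inf")  # tactic failed; environment reports an error

def propose_tactics(state):
    """Stub policy: a language model would score candidate next proof lines."""
    return [("simp", 0.2), ("induction n", 0.9), ("ring", 1.5)]

def best_first_search(env, root_state, budget=100):
    counter = itertools.count()                       # tie-breaker for the heap
    frontier = [(0.0, next(counter), root_state, [])]
    while frontier and budget > 0:
        cost, _, state, proof = heapq.heappop(frontier)
        budget -= 1
        for tactic, score in propose_tactics(state):
            new_state, step_cost = env.apply(state, tactic)
            if new_state is None:
                continue                              # prune on environment error
            if new_state == "done":
                return proof + [tactic]               # complete proof found
            heapq.heappush(frontier, (cost + score + step_cost,
                                      next(counter), new_state, proof + [tactic]))
    return None  # budget exhausted without a proof

print(best_first_search(LeanEnv(), "n + 0 = n"))      # -> ['simp'] in this stub

In the other paradigms mentioned in the abstract, the per-line expansion would presumably be replaced by longer whole-proof generations or by sub-goal decomposition, but the loop of proposing candidates, checking them against the environment, and re-ranking keeps the same general shape.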