23 May 2024
Le Bois-Marie
Time zone: Europe/Paris

List of Contributions

  1. Amaury Hayat (École des Ponts ParisTech & CERMICS)
    23/05/2024 09:30

    Large language models have achieved remarkable successes in recent years. This naturally raises the question: can AI assist mathematicians in solving open problems in mathematics? We will explore how a language model can be trained to learn mathematical intuition about open problems and guess candidate solutions, with a focus on a few examples. We will also explore the application of LLMs to automated...

  2. Julia Kempe (NYU Center for Data Science and Courant Institute of Mathematical Sciences)
    23/05/2024 10:30

    As AI and LLM model sizes grow, neural scaling laws have become a crucial tool for predicting the improvement of large models as capacity and the amount of original (human or natural) training data increase. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increasing amounts of synthesized data.
    In this talk...

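A neural scaling law of the kind mentioned in this abstract is typically a power law in model or data size. The following is a minimal illustrative sketch, with hypothetical constants chosen only for demonstration, not values from any actual experiment.

```python
import numpy as np

# Hypothetical power-law scaling: loss L(N) = a * N**(-b) + c, where N is
# model (or data) size. The constants a, b, c are illustrative only.
def scaling_law(n, a=10.0, b=0.3, c=1.5):
    return a * n ** (-b) + c

sizes = np.array([1e6, 1e8, 1e10])
losses = scaling_law(sizes)
# Under this law, loss decreases monotonically toward the floor c as N grows.
print(losses)
```

The irreducible term `c` is what makes the question of synthetic data in the training ecosystem interesting: fits of this form are used to extrapolate how much further scaling can help.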
  3. Yiannis Vlassopoulos (Athena Research Center & IHES)
    23/05/2024 12:00

    Large Language Models are transformer neural networks trained to produce a probability distribution over the possible next words of given texts in a corpus, in such a way that the most likely predicted word is the actual next word in the training text.

    We will explain the mathematical structure defined by such conditional probability distributions of text extensions. Changing...

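The conditional distributions described in this abstract can be illustrated in their simplest form with a bigram model: counting, over a toy corpus (hypothetical, for illustration only), how often each word follows each other word.

```python
from collections import Counter, defaultdict

# Toy corpus; a real LLM conditions on the whole preceding text,
# but a single-word context already yields conditional distributions.
corpus = "the cat sat on the mat the cat ran".split()

# Count bigram occurrences: counts[prev][next] = number of times
# `next` follows `prev` in the corpus.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_distribution(word):
    """Empirical P(next word | previous word = word)."""
    c = counts[word]
    total = sum(c.values())
    return {w: n / total for w, n in c.items()}

# "the" is followed by "cat" twice and "mat" once in the corpus,
# so the distribution is {'cat': 2/3, 'mat': 1/3}.
print(next_word_distribution("the"))
```

A transformer replaces the count table with a learned function of the full context, but the object it outputs is exactly such a conditional distribution over next words.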
  4. François Charton (Meta AI Research)
    23/05/2024 14:30

    Many problems in mathematics can be cast as translation tasks: problems, represented as sentences in some language, are translated into their solutions by language models trained on synthetic examples. In this setting, we can choose the distribution of problems and solutions used to train the model. I present examples from three different experiments, which suggest that this can make a...

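The translation framing in this abstract can be sketched with a synthetic-data generator: each training example is a (problem, solution) pair of token sequences, and the sampling distribution is entirely under our control. The GCD task and the `make_example` helper below are hypothetical choices for illustration.

```python
import math
import random

# Generate a synthetic (problem, solution) pair as text sequences.
# The distribution of (a, b), here uniform on [1, max_n], is a design
# choice of the experimenter, as the abstract emphasizes.
def make_example(rng, max_n=100):
    a, b = rng.randint(1, max_n), rng.randint(1, max_n)
    problem = f"gcd {a} {b}"
    solution = str(math.gcd(a, b))
    return problem, solution

rng = random.Random(0)
pairs = [make_example(rng) for _ in range(3)]
print(pairs)
```

A sequence-to-sequence model trained on such pairs "translates" the problem string into the solution string; changing the sampling distribution changes what the model learns.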
  5. Andrew Dudzik (Google DeepMind)
    23/05/2024 15:30

    Neural networks, particularly LLMs, are notoriously poor at algorithmic tasks, such as sorting, shortest path, and even basic arithmetic. Across three papers, we explored the problem of "aligning" architectures to classical computer programs, and showed that this question relates to familiar mathematical concepts: polynomial functors, cohomology, and higher categories.

  6. Gabriel Synnaeve (Meta AI Research)
    23/05/2024 17:00

    Large language models (LLMs) are trained in a very simple way, and many of the properties we attribute to them are already present in the training data. In this talk we will review how LLMs are trained today and what new training paradigms aim at grounding these LLMs in the impact of their generations. In the context of code generation, this means, for instance, grounding the LLM with the...
