Large Language Models have achieved remarkable success in recent years. This naturally raises the question: can AI assist mathematicians in solving open problems in mathematics? We will explore how a language model can be trained to learn mathematical intuition on open problems and guess candidate solutions, with a focus on a few examples. We will also explore the application of LLMs to automated...
As AI and LLM model sizes grow, neural scaling laws have become a crucial tool for predicting the improvement of large models as capacity and the amount of original (human or natural) training data increase. Yet the widespread use of popular models means that the ecosystem of online data and text will co-evolve, progressively containing increasing amounts of synthesized data.
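As a minimal sketch of what such a prediction looks like in practice (my illustration, not material from the talk), one can fit a power law of the form L(N) = c + a·N^(-alpha) to measured losses at several model sizes and extrapolate to larger models. The data points below are purely hypothetical placeholders.

```python
# Fit a simple neural scaling law L(N) = c + a * N^(-alpha) and extrapolate.
# Sizes are in millions of parameters; losses are illustrative, not real.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, c):
    # Irreducible loss c plus a power-law term that shrinks with model size n.
    return c + a * n ** (-alpha)

sizes = np.array([10.0, 100.0, 1000.0, 10000.0])   # hypothetical model sizes (millions of params)
losses = np.array([3.9, 3.2, 2.7, 2.4])            # hypothetical validation losses

params, _ = curve_fit(power_law, sizes, losses, p0=[2.0, 0.3, 2.0])
a, alpha, c = params
print(f"fit: L(N) ~= {c:.2f} + {a:.2f} * N^(-{alpha:.3f})")

# Predict the loss of a much larger (100B-parameter) model from the fit.
print("predicted loss at N = 100000 M:", power_law(100000.0, *params))
```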
In this talk...
Large Language Models are transformer neural networks trained to produce a probability distribution over the possible next words of given texts in a corpus, in such a way that the most likely predicted word is the actual next word in the training text.
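A minimal sketch of this training objective (my illustration, using a toy model rather than a real transformer): the model outputs a distribution over the vocabulary at each position, and cross-entropy training pushes the true next token in the corpus to be the most likely one.

```python
# Toy next-token prediction: embedding + linear head stands in for a transformer.
import torch
import torch.nn as nn

vocab_size = 50
model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# A hypothetical tokenized corpus: one sequence of token ids.
tokens = torch.randint(0, vocab_size, (1, 128))
inputs, targets = tokens[:, :-1], tokens[:, 1:]   # predict token t+1 from token t

for step in range(100):
    logits = model(inputs)                        # (batch, seq, vocab) next-token scores
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, vocab_size), targets.reshape(-1)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# softmax over the logits is the conditional distribution on possible next tokens.
probs = torch.softmax(model(inputs), dim=-1)
```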
We will explain the mathematical structure defined by such conditional probability distributions on text extensions. Changing...
Many problems in mathematics can be framed as translation tasks: problems, represented as sentences in some language, are translated into their solutions by language models trained on synthetic examples. In this setting, we can choose the distribution of problems and solutions used to train the model. I will present examples from three different experiments, which suggest that this can make a...
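To make the translation framing concrete, here is a minimal sketch (my illustration, not the speaker's setup) of generating synthetic problem/solution pairs as strings, where the "problem" is a polynomial and the "solution" is its derivative. The degree_weights parameter is the knob that lets us choose the distribution of problems the model is trained on.

```python
# Generate synthetic (problem, solution) string pairs with a tunable distribution.
import random

def sample_pair(max_degree=4, degree_weights=None):
    degrees = list(range(1, max_degree + 1))
    weights = degree_weights or [1] * len(degrees)
    d = random.choices(degrees, weights=weights)[0]      # sample the problem's degree
    coeffs = [random.randint(-9, 9) for _ in range(d + 1)]
    problem = " + ".join(f"{c}*x^{i}" for i, c in enumerate(coeffs))
    solution = " + ".join(f"{c * i}*x^{i - 1}" for i, c in enumerate(coeffs) if i > 0)
    return problem, solution

# Skew the training distribution toward higher-degree problems.
dataset = [sample_pair(degree_weights=[1, 1, 3, 5]) for _ in range(10000)]
print(dataset[0])
```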
Neural networks, particularly LLMs, are notoriously poor at algorithmic tasks, such as sorting, shortest path, and even basic arithmetic. Across three papers, we explored the problem of "aligning" architectures to classical computer programs, and showed that this question relates to familiar mathematical concepts: polynomial functors, cohomology, and higher categories.
Large language models (LLMs) are trained in a very simple way. Many of the properties we attribute to them are already present in the training data. In this talk we will review how LLMs are trained today and what new training paradigms aim at grounding those LLMs in the impact of their generations. In the context of code generation, this is for instance grounding the LLM with the...
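One common pattern for this kind of grounding in code generation (a sketch under my own assumptions, not necessarily the speaker's method) is to sample candidate programs, execute them against unit tests, and keep or reward only those that pass. The generate_candidates function below is a hypothetical stand-in for an LLM call.

```python
# Execution-grounded filtering of generated code candidates.
def generate_candidates(prompt, n=4):
    # Placeholder: a real system would sample n completions from an LLM.
    return ["def add(a, b):\n    return a + b",
            "def add(a, b):\n    return a - b"] * (n // 2)

def passes_tests(code, tests):
    scope = {}
    try:
        exec(code, scope)                 # define the candidate function
        for call, expected in tests:
            if eval(call, scope) != expected:
                return False
        return True
    except Exception:
        return False                      # crashes count as failures

tests = [("add(2, 3)", 5), ("add(-1, 1)", 0)]
grounded = [c for c in generate_candidates("write add(a, b)") if passes_tests(c, tests)]
print(f"{len(grounded)} candidates survive execution-based filtering")
```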