This will be a discussion of large language models such as OpenAI’s GPT series, oriented towards physicists and mathematicians. After a brief survey of the state of the art, we describe transformer models in detail and discuss current ideas on how they work, and on how models trained to predict the next word in a text are able to perform other tasks that display intelligence.
Emmanuel Ullmo