Speaker
Borjan Geshkovski
Description
With remarkable empirical success, Transformers enable large language models to compute succinct representations of data using the self-attention mechanism. We model these architectures as interacting particle systems, in the spirit of models from collective behaviour and opinion dynamics, which allows us to show the emergence of various clustering/coagulation phenomena. Associated control problems will also be discussed. Based on joint work with Cyril Letrouit, Yury Polyanskiy, and Philippe Rigollet.
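For orientation, a representative form of the self-attention dynamics behind this picture (a sketch drawn from the joint work cited above; the parameter matrices $Q$, $K$, $V$, the inverse temperature $\beta$, and the restriction of the tokens to the unit sphere are modelling choices that vary across versions of the model) is

\[
\dot{x}_i(t) \;=\; \mathbf{P}_{x_i(t)}\!\left( \frac{1}{Z_{\beta,i}(t)} \sum_{j=1}^{n} e^{\beta \langle Q x_i(t),\, K x_j(t) \rangle}\, V x_j(t) \right),
\qquad
Z_{\beta,i}(t) \;=\; \sum_{k=1}^{n} e^{\beta \langle Q x_i(t),\, K x_k(t) \rangle},
\]

where the tokens $x_1(t), \dots, x_n(t)$ evolve on the unit sphere $\mathbb{S}^{d-1}$, $\mathbf{P}_x$ denotes the orthogonal projection onto the tangent space at $x$, and the softmax weights couple every particle to every other. In this language, clustering corresponds to the particles coalescing as $t \to \infty$.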