Modelling, Estimating and Sampling Multiscale Maximum Entropy Distributions in High Dimensions
par
Salle J. Cavailles
1R2-132
Modelling, estimating and sampling non-Gaussian probability distributions in high dimensions from limited data lie at the heart of statistical physics and machine learning. Such distributions describe processes as varied as ocean currents, flocks of birds, or natural images. Classical approaches typically rely on maximum-entropy models with few parameters, which are limited in expressivity and estimated by algorithms that do not scale to high dimensions. Recent machine-learning methods, by contrast, sample efficiently from high-dimensional, multimodal distributions by transporting noise onto data, but they rely on neural networks with billions of parameters, which limits interpretability and requires hundreds of thousands of training samples.
In this presentation, we develop algorithms to estimate and sample high-dimensional maximum-entropy distributions (non-Gaussian, multiscale and multimodal) from few realisations. The central challenge is the curse of dimensionality, which afflicts modelling, estimation and sampling alike because non-Gaussian processes in physics and nature exhibit long-range dependencies and multimodal distributions. On the one hand, a hierarchical factorisation of the distribution into conditional probabilities across scales in a wavelet basis disentangles long-range dependencies. On the other hand, multimodality can be addressed with Moment-Guided Diffusion (MGD), an algorithm that transports noise onto the target maximum-entropy distribution. Together, these two strategies define a mathematically guaranteed framework to model multiscale maximum-entropy distributions in high dimensions with few parameters, estimate them from scarce datasets and sample them efficiently, with applications to physics and finance.