11e Journée Statistique et Informatique pour la Science des Données à Paris-Saclay
Friday 3 April 2026

09:30 - 10:00
Welcome coffee
Room: Centre de Conférences Marilyn et James Simons

10:00 - 10:50
Linear Bandits on Ellipsoids: Minimax Optimal Algorithms
Richard Combes (CentraleSupélec)
Room: Centre de Conférences Marilyn et James Simons
We consider linear stochastic bandits where the set of actions is an ellipsoid. We provide the first known minimax optimal algorithm for this problem. We first derive a novel information-theoretic lower bound on the regret of any algorithm, which must be at least $\Omega(\min(d \sigma \sqrt{T} + d \|\theta\|_{A}, \|\theta\|_{A} T))$ where $d$ is the dimension, $T$ the time horizon, $\sigma^2$ the noise variance, $A$ a matrix defining the set of actions, and $\theta$ the vector of unknown parameters. We then provide an algorithm whose regret matches this bound up to a universal multiplicative constant. The algorithm is non-classical in the sense that it is neither optimistic nor a sampling algorithm. The main idea is to combine a novel sequential procedure to estimate $\|\theta\|$ with an explore-and-commit strategy informed by this estimate. The algorithm is highly computationally efficient: a run requires only time $\mathcal{O}(dT + d^2 \log(T/d) + d^3)$ and memory $\mathcal{O}(d^2)$, in contrast with known optimistic algorithms, which are not implementable in polynomial time. We go beyond minimax optimality and show that our algorithm is locally asymptotically minimax optimal, a much stronger notion of optimality. We further provide numerical experiments to illustrate our theoretical findings.
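To make the explore-and-commit template concrete, here is a minimal sketch on an ellipsoidal action set $\{x : x^\top A x \le 1\}$. This is a hypothetical illustration, not the authors' algorithm: it replaces their sequential norm-estimation procedure with a fixed-length uniform exploration phase, and the matrix `A`, parameter `theta`, and horizon are made-up toy values. The commit step uses the closed-form maximiser of $\theta^\top x$ on the ellipsoid, $x^* = A^{-1}\theta / \sqrt{\theta^\top A^{-1}\theta}$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T_explore, sigma = 3, 500, 0.1
A = np.diag([1.0, 2.0, 4.0])          # action set {x : x^T A x <= 1}
theta = np.array([0.5, -0.3, 0.2])    # unknown parameter (hidden from the learner)

# Explore: play random directions projected onto the ellipsoid boundary,
# observe noisy linear rewards, and form a least-squares estimate of theta.
X = rng.standard_normal((T_explore, d))
X /= np.sqrt(np.einsum("ti,ij,tj->t", X, A, X))[:, None]
y = X @ theta + sigma * rng.standard_normal(T_explore)
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Commit: closed-form best action for the estimated parameter,
# x* = A^{-1} theta_hat / sqrt(theta_hat^T A^{-1} theta_hat).
Ainv_t = np.linalg.solve(A, theta_hat)
x_commit = Ainv_t / np.sqrt(theta_hat @ Ainv_t)
```

With enough exploration, the committed action is feasible and nearly optimal; the minimax-optimal method in the talk instead chooses the exploration length adaptively from its estimate of $\|\theta\|$.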

10:50 - 11:20
Coffee break
Room: Centre de Conférences Marilyn et James Simons

11:20 - 12:10
Statistical Analysis of Multiple Networks
Tabea Rebafka (AgroParisTech)
Room: Centre de Conférences Marilyn et James Simons
This talk provides a brief introduction to statistical network analysis and random graph models. We then focus on the problem of estimating the graphon function, which characterizes nonparametric exchangeable random graph models. Our main emphasis is on the setting where multiple networks are observed, which introduces additional challenges compared to the classical single-network framework. To address this, we propose a new histogram-based estimator with low computational complexity. The key idea is to jointly align the nodes across all observed graphs, rather than processing each network independently as in most existing approaches. We establish consistency results for the proposed estimator and demonstrate through numerical experiments that it outperforms current methods in both estimation accuracy and computational efficiency. Finally, we show that, when used for data augmentation in graph neural network classification tasks, our approach leads to improved performance on various real-world datasets. This is joint work with Roland Sogan.
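The histogram idea can be sketched in a few lines. The following toy version is not the joint-alignment estimator from the talk: it aligns the nodes of each graph by a crude degree-rank proxy for the latent positions (rather than jointly across graphs), then averages edge densities over $K \times K$ blocks; the graphon `0.15 + 0.7uv` and all sizes are made-up.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, m = 60, 3, 5                     # nodes, histogram bins, number of graphs

def graphon(u, v):                     # ground-truth graphon w(u, v)
    return 0.15 + 0.7 * u * v

# Sample m exchangeable graphs from the same graphon.
graphs = []
for _ in range(m):
    u = rng.uniform(size=n)
    P = graphon(u[:, None], u[None, :])
    A = (rng.uniform(size=(n, n)) < P).astype(float)
    A = np.triu(A, 1)
    graphs.append(A + A.T)

# Histogram estimator: sort nodes by degree in each graph (a rough proxy
# for latent position), then average edge densities over K x K blocks.
W_hat = np.zeros((K, K))
for A in graphs:
    order = np.argsort(A.sum(axis=0))
    A = A[np.ix_(order, order)]
    blocks = np.array_split(np.arange(n), K)
    for a in range(K):
        for b in range(K):
            W_hat[a, b] += A[np.ix_(blocks[a], blocks[b])].mean() / m
```

Because this graphon is increasing in both arguments, the recovered block matrix should increase along the diagonal; the joint alignment of the talk replaces the per-graph degree sort and is what yields the consistency guarantees.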

12:10 - 13:00
A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives
Anna Korba (ENSAE)
Room: Centre de Conférences Marilyn et James Simons
Several emerging post-Bayesian methods target a probability distribution for which an entropy-regularised variational objective is minimised. This increased flexibility introduces a computational challenge, as one loses access to an explicit unnormalised density for the target. To mitigate this difficulty, we introduce a novel measure of suboptimality called 'gradient discrepancy', and in particular a 'kernel' gradient discrepancy (KGD) that can be explicitly computed. In the standard Bayesian context, KGD coincides with the kernel Stein discrepancy (KSD), and we obtain a novel characterisation of KSD as measuring the size of a variational gradient. Outside this familiar setting, KGD enables novel sampling algorithms to be developed and compared, even when unnormalised densities cannot be obtained. To illustrate this point several novel algorithms are proposed and studied, including a natural generalisation of Stein variational gradient descent, with applications to mean-field neural networks and predictively oriented posteriors presented. On the theoretical side, our principal contribution is to establish sufficient conditions for desirable properties of KGD, such as continuity and convergence control. Joint work with Clémentine Chazal, Heishiro Kanagawa, Zheyang Shen and Chris J. Oates.
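For orientation, here is the special case the abstract mentions: the kernel Stein discrepancy in the standard Bayesian setting, where the score of the target is available. This sketch (a hypothetical illustration, not the KGD of the talk) computes the V-statistic KSD of samples against a standard normal target with an RBF kernel, using the Stein kernel $u_p(x,y) = s(x)s(y)k + s(x)\partial_y k + s(y)\partial_x k + \partial_x\partial_y k$ with score $s(x) = -x$.

```python
import numpy as np

def ksd_gauss(x, h=1.0):
    """V-statistic KSD^2 of 1-d samples x against a standard normal target,
    with RBF kernel k(x, y) = exp(-(x - y)^2 / (2 h^2))."""
    x = np.asarray(x, dtype=float)
    d = x[:, None] - x[None, :]
    k = np.exp(-d**2 / (2 * h**2))
    s = -x                                   # score of N(0,1): d/dx log p(x) = -x
    # Stein kernel: s(x)s(y)k + s(x) dk/dy + s(y) dk/dx + d^2k/dxdy
    u = (s[:, None] * s[None, :] * k
         + s[:, None] * (d / h**2) * k
         + s[None, :] * (-d / h**2) * k
         + (1 / h**2 - d**2 / h**4) * k)
    return u.mean()

rng = np.random.default_rng(2)
on_target  = ksd_gauss(rng.standard_normal(500))        # samples from the target
off_target = ksd_gauss(rng.standard_normal(500) + 2.0)  # shifted samples
```

Samples from the target give a discrepancy near zero while shifted samples do not; the KGD of the talk generalises this quantity to entropy-regularised objectives where no unnormalised density (hence no score of this simple form) is available.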

13:00 - 14:30
Buffet lunch
Room: Centre de Conférences Marilyn et James Simons

14:30 - 15:20
Asymptotic Theory of Iterated Empirical Risk Minimization, with Applications to Active Learning
Hugo Cui (CNRS, Paris-Saclay)
Room: Centre de Conférences Marilyn et James Simons
We study a class of iterated empirical risk minimization (ERM) procedures in which two successive ERMs are performed on the same dataset, and the predictions of the first estimator enter as an argument in the loss function of the second. This setting, which arises naturally in active learning and reweighting schemes, introduces intricate statistical dependencies across samples and fundamentally distinguishes the problem from classical single-stage ERM analyses. For linear models trained with a broad class of convex losses on Gaussian mixture data, we derive a sharp asymptotic characterization of the test error in the high-dimensional regime where the sample size and ambient dimension scale proportionally. Our results provide explicit, fully asymptotic predictions for the performance of the second-stage estimator despite the reuse of data and the presence of prediction-dependent losses. We apply this theory to revisit a well-studied pool-based active learning problem, removing oracle and sample-splitting assumptions made in prior work. We uncover a fundamental tradeoff in how the labeling budget should be allocated across stages, and demonstrate a double-descent behavior of the test error driven purely by data selection, rather than model size or sample count. Based on joint work with Yue M. Lu.
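The two-stage structure is easy to exhibit in code. The following toy sketch (a hypothetical instance, not the model analysed in the talk) fits a first ridge predictor on an initial labelled batch, then lets its predictions enter the second stage by selecting the lowest-margin pool points for relabelling and refitting; all sizes and the noise level are made up.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 400, 20
theta_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = np.sign(X @ theta_star + 0.3 * rng.standard_normal(n))

def ridge(Xs, ys, lam=1.0):
    k = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(k), Xs.T @ ys)

# Stage 1: ERM on an initial labelled batch.
n1 = 100
w1 = ridge(X[:n1], y[:n1])

# Stage 2: the first-stage predictions enter the second ERM -- here by
# selecting the pool points where stage 1 is least confident (smallest
# margin |x^T w1|) and refitting on that subset.
margins = np.abs(X @ w1)
budget = 100
picked = np.argsort(margins)[:budget]
w2 = ridge(X[picked], y[picked])
```

Because `picked` depends on `w1`, which was trained on the same pool, the second-stage loss is prediction-dependent and the samples are no longer independent of the estimator; the theory of the talk characterises exactly this dependence without sample splitting.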

15:20 - 16:10
Convergence and Linear Speed-Up in Stochastic Federated Learning
Paul Mangold (École polytechnique)
Room: Centre de Conférences Marilyn et James Simons
In federated learning, multiple users collaboratively train a machine learning model without sharing local data. To reduce communication, users perform multiple local stochastic gradient steps that are then aggregated by a central server. However, due to data heterogeneity, local training introduces bias. In this talk, I will present a novel interpretation of the Federated Averaging algorithm, establishing its convergence to a stationary distribution. By analyzing this distribution, we show that the bias consists of two components: one due to heterogeneity and another due to gradient stochasticity. I will then extend this analysis to the Scaffold algorithm, demonstrating that it effectively mitigates heterogeneity bias but not stochasticity bias. Finally, we show that both algorithms achieve linear speed-up in the number of agents, a key property in federated stochastic optimization.
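The heterogeneity bias is visible even in a deterministic one-dimensional toy problem. The sketch below (a made-up example, not from the talk) runs Federated Averaging on two clients with quadratics $f_i(x) = a_i (x - c_i)^2 / 2$ of different curvatures: with one local step per round the fixed point is the global minimiser, but with several local steps it drifts away from it.

```python
import numpy as np

# Two clients with heterogeneous quadratics f_i(x) = a_i (x - c_i)^2 / 2.
a = np.array([1.0, 10.0])
c = np.array([0.0, 1.0])
x_star = (a * c).sum() / a.sum()        # minimiser of f_1 + f_2

def fedavg_fixed_point(K, eta=0.05, rounds=5000):
    """Deterministic FedAvg: K local gradient steps per round, then average."""
    x = 0.5
    for _ in range(rounds):
        local_models = []
        for ai, ci in zip(a, c):
            z = x
            for _ in range(K):
                z -= eta * ai * (z - ci)   # local gradient step on f_i
            local_models.append(z)
        x = float(np.mean(local_models))
    return x

bias_1  = abs(fedavg_fixed_point(K=1)  - x_star)   # one local step: no bias
bias_10 = abs(fedavg_fixed_point(K=10) - x_star)   # many local steps: biased
```

In the stochastic setting analysed in the talk, the iterates converge to a stationary distribution rather than a point, and a second bias term due to gradient noise appears on top of the heterogeneity bias shown here.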

16:10 - 16:40
Coffee break
Room: Centre de Conférences Marilyn et James Simons

16:40 - 17:30
Beyond Kemeny Medians: Consensus Ranking Distributions: Definition, Properties and Statistical Learning
Ekhine Irurozki (Télécom Paris)
Room: Centre de Conférences Marilyn et James Simons
Summarising a distribution over rankings by a single Kemeny median fails whenever the distribution is multimodal or heterogeneous. Drawing on the histogram analogy, we introduce Consensus Ranking Distributions (CRD): sparse mixtures of local Kemeny medians indexed by a partition of the space of rankings, interpolating between a single consensus ranking and the raw empirical distribution. We propose the COAST algorithm, a top-down decision tree that learns the partition from data using pairwise comparison splits, and establish a PAC-style generalisation bound. Experiments on synthetic mixtures and real preference data illustrate the method's ability to recover modes and produce interpretable summaries.
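The failure of a single Kemeny median on multimodal data can be seen in a tiny brute-force example (a hypothetical illustration, not the COAST algorithm): when half the voters rank four items one way and half the reverse, every candidate consensus ranking pays the same total Kendall-tau cost, while two local medians, one per mode, summarise the sample perfectly.

```python
import numpy as np
from itertools import permutations

def kendall(p, q):
    """Kendall tau distance: number of item pairs ranked in opposite order."""
    pos_q = {v: i for i, v in enumerate(q)}
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if pos_q[p[i]] > pos_q[p[j]])

# A bimodal sample: five voters rank (0,1,2,3), five rank the exact reverse.
sample = [(0, 1, 2, 3)] * 5 + [(3, 2, 1, 0)] * 5

def kemeny_median(rankings):
    """Brute-force Kemeny median over all rankings of 4 items."""
    candidates = list(permutations(range(4)))
    return min(candidates, key=lambda c: sum(kendall(c, r) for r in rankings))

med = kemeny_median(sample)
cost_single = sum(kendall(med, r) for r in sample)
# Two local medians (one per mode) fit the same data with zero cost.
cost_two = sum(min(kendall((0, 1, 2, 3), r), kendall((3, 2, 1, 0), r))
               for r in sample)
```

Here `cost_single` is 30 no matter which ranking is chosen (each voter pair of modes is at maximal distance 6, split across any candidate), while `cost_two` is 0; a Consensus Ranking Distribution with two cells captures exactly this kind of structure.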