10e Journée Statistique et Informatique pour la Science des Données à Paris-Saclay

Centre de Conférences Marilyn et James Simons (Le Bois-Marie)
35, route de Chartres, CS 40001, 91893 Bures-sur-Yvette Cedex

Description

The aim of this workshop is to bring together mathematicians and computer scientists around a series of talks on recent results from statistics, machine learning, and, more generally, data science research. Various topics will be covered, including machine learning, optimization, deep learning, optimal transport, fairness, and statistics. This workshop is particularly intended for doctoral and post-doctoral researchers.

 

Registration is free and open until March 25, 2025.

Invited speakers:
Anne Auger (Inria Saclay)
Etienne Boursier (Inria, Université Paris-Saclay)
Solenne Gaucher (École polytechnique)
Charlotte Laclau (Télécom Paris)
Arshak Minasyan (CentraleSupélec)
Nicolas Vayatis (ENS Paris-Saclay)

Organizers: 
Evgenii Chzhen (Laboratoire de Mathématiques d'Orsay, Université Paris-Saclay)
Erwan Le Pennec (École polytechnique)

Contact: Cécile Gourgues
    • 09:30 10:00
      Welcome coffee 30m
    • 10:00 10:50
      Topics in Algorithmic Fairness 50m

      Artificial intelligence (AI) is increasingly shaping the decisions that affect our lives—from hiring and education to healthcare and access to social services. While AI promises efficiency and objectivity, it also carries the risk of perpetuating and even amplifying societal biases embedded in the data used to train these systems. Many real-world examples highlight the dangers of relying on automated decision-making, as these algorithms can reinforce existing inequalities and discrimination.

      In this talk, I will explore some challenges of algorithmic fairness. I will begin by discussing the origins of bias and its impact on machine learning algorithms. I will then present some of the main approaches used to define, study, and enforce algorithmic fairness. In the second part of the talk, I will focus more specifically on the statistical fairness framework and present some classical results on the problem of fair regression. Specifically, I will focus on the criterion of Demographic Parity and examine the relationship between optimal predictions in the contexts of fair classification and fair regression.
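
      As a concrete complement to the Demographic Parity criterion mentioned above, here is a minimal sketch (not taken from the talk) that measures a Demographic Parity gap on synthetic data: the difference in positive-prediction rates of a toy classifier across a binary sensitive attribute, together with a Wasserstein-type analogue for a toy regression function. The data-generating process, the predictors, and the constants are all illustrative assumptions.

      # Illustrative sketch: Demographic Parity (DP) gaps on synthetic data.
      import numpy as np
      from scipy.stats import wasserstein_distance

      rng = np.random.default_rng(0)

      # Synthetic data: the sensitive attribute S shifts the feature distribution.
      n = 10_000
      s = rng.integers(0, 2, size=n)                  # binary sensitive attribute
      x = rng.normal(loc=0.5 * s, scale=1.0, size=n)  # one-dimensional feature

      # Toy predictors that ignore S.
      def classify(x):
          return (x > 0.0).astype(int)

      def regress(x):
          return 1.0 / (1.0 + np.exp(-x))

      # DP gap for classification: difference of positive-prediction rates.
      y_hat = classify(x)
      dp_gap_clf = abs(y_hat[s == 0].mean() - y_hat[s == 1].mean())

      # For regression, a DP-style measure compares the distributions of
      # predictions across groups, e.g. via the 1D Wasserstein distance.
      f_hat = regress(x)
      dp_gap_reg = wasserstein_distance(f_hat[s == 0], f_hat[s == 1])

      print(f"DP gap (classification): {dp_gap_clf:.3f}")
      print(f"DP gap (regression, W1): {dp_gap_reg:.3f}")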

      Speaker: Solenne Gaucher (École polytechnique)
    • 10:50 11:20
      Coffee break 30m
    • 11:20 12:10
      Deep Out-of-the-distribution Uncertainty Quantification (in) for Data (Science) Scientists 50m

      In this talk, we present a practical solution to the lack of prediction diversity observed recently for deep learning approaches when used out-of-distribution. Considering that this issue is mainly related to a lack of weight diversity, we introduce the maximum entropy principle for the weight distribution, coupled with the standard, task-dependent, in-distribution data-fitting term. We demonstrate numerically that the derived algorithm is systematically relevant. We also plan to use this strategy to make out-of-distribution predictions about the future of data (science) scientists.
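
      As a rough illustration of coupling a data-fitting term with a diversity-promoting term on the weights, here is a minimal sketch on a toy regression problem: an ensemble of small networks is trained with a standard in-distribution loss plus a crude pairwise-distance proxy for the entropy of the weight distribution. This is an illustrative stand-in, not the speaker's algorithm; the architecture, the entropy proxy, and all constants are assumptions.

      # Illustrative sketch: data fit + weight-diversity bonus for a small ensemble.
      import torch
      import torch.nn as nn

      torch.manual_seed(0)

      # Toy in-distribution regression data.
      x = torch.linspace(-1, 1, 200).unsqueeze(1)
      y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)

      ensemble = [nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
                  for _ in range(5)]
      params = [p for net in ensemble for p in net.parameters()]
      opt = torch.optim.Adam(params, lr=1e-2)

      def flat_weights(net):
          return torch.cat([p.reshape(-1) for p in net.parameters()])

      lam = 1e-3  # strength of the diversity (entropy-proxy) bonus
      for step in range(500):
          opt.zero_grad()
          data_fit = sum(((net(x) - y) ** 2).mean() for net in ensemble)
          # Crude entropy proxy: log pairwise distances between weight vectors
          # (a larger spread in weight space gives a larger bonus).
          ws = torch.stack([flat_weights(net) for net in ensemble])
          entropy_proxy = torch.log(torch.pdist(ws) + 1e-8).mean()
          loss = data_fit - lam * entropy_proxy
          loss.backward()
          opt.step()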

      Speaker: Nicolas Vayatis (ENS Paris-Saclay)
    • 12:10 13:00
      Slow Convergence of Stochastic Optimization Algorithms Without Derivatives Is Avoidable 50m

      Many approaches to optimization without derivatives rooted in probability theory are variants of stochastic approximation, such as the well-known Kiefer-Wolfowitz method, a finite-difference stochastic approximation (FDSA) algorithm that estimates gradients using finite differences. Such methods are known to converge slowly: in many cases the best possible convergence rate is governed by the Central Limit Theorem, leading to a mean square error that vanishes at a rate inversely proportional to the number of iterations.
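
      For concreteness, here is a minimal sketch of a Kiefer-Wolfowitz-type FDSA iteration on a noisy quadratic: each partial derivative is estimated from two noisy function evaluations, with decreasing gain and finite-difference sequences. The gain choices below are illustrative, not tuned or optimal.

      # Illustrative FDSA (Kiefer-Wolfowitz-type) iteration on a noisy quadratic.
      import numpy as np

      rng = np.random.default_rng(0)

      def noisy_f(x):
          return float(np.sum(x ** 2) + 0.1 * rng.normal())

      d = 5
      x = np.ones(d)
      for k in range(1, 10_001):
          a_k = 1.0 / k            # step-size (gain) sequence
          c_k = 1.0 / k ** 0.25    # finite-difference width
          grad_hat = np.empty(d)
          for i in range(d):
              e_i = np.zeros(d)
              e_i[i] = 1.0
              grad_hat[i] = (noisy_f(x + c_k * e_i) - noisy_f(x - c_k * e_i)) / (2 * c_k)
          x = x - a_k * grad_hat

      print("final squared error:", float(np.sum(x ** 2)))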

      In this talk, I will show that those slow convergence rates are not a foregone conclusion for stochastic algorithms without derivatives. I will present a class of adaptive stochastic algorithms originating from the class of Evolution Strategy algorithms, for which we can prove asymptotic geometric convergence of the mean square error on classes of functions that include non-convex and non-quasi-convex functions. This corresponds to linear convergence in optimization. I will highlight the main differences compared to FDSA algorithms and explain how the stability analysis of the underlying Markov chain enables linear convergence guarantees.
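
      For contrast, the sketch below shows the simplest member of the evolution strategy family, a (1+1)-ES with the classical one-fifth success rule: the step size adapts multiplicatively depending on whether the sampled candidate improves the objective, which is the kind of adaptation mechanism that makes geometric (linear) convergence possible. This toy implementation and its constants are illustrative and are not the algorithms analysed in the talk.

      # Illustrative (1+1)-ES with the one-fifth success rule on the sphere function.
      import numpy as np

      rng = np.random.default_rng(0)

      def f(x):                          # derivative-free objective (sphere)
          return float(np.sum(x ** 2))

      d = 10
      x, sigma = np.ones(d), 1.0
      alpha = 1.5 ** (1.0 / d)           # gentle multiplicative adaptation factor
      for _ in range(2000):
          candidate = x + sigma * rng.normal(size=d)
          if f(candidate) <= f(x):       # success: accept and enlarge the step
              x = candidate
              sigma *= alpha
          else:                          # failure: shrink the step
              sigma /= alpha ** 0.25     # stationary at roughly a 1/5 success rate
      print("f(x) after 2000 iterations:", f(x))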

      I will discuss the connection to the analysis of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), widely regarded as one of the most effective stochastic algorithms for solving complex derivative-free optimization problems.

      Speaker: Anne Auger (Inria Saclay)
    • 13:00 14:30
      Buffet lunch 1h 30m
    • 14:30 15:20
      Fair Classifiers via Transferable Representations 50m

      Group fairness is a central research topic in text classification, where achieving fair treatment across sensitive groups (e.g., women and men) remains an open challenge. In this talk, I will present an approach that extends the use of the Wasserstein Independence measure for learning unbiased neural text classifiers. Given the challenge of distinguishing fair from unfair information in a text encoder, we draw inspiration from adversarial training by inducing Wasserstein independence between representations learned for the target label and those learned for a sensitive attribute. We further show that domain adaptation can be efficiently leveraged to remove the need for access to the sensitive attributes in the dataset at training time. I will present theoretical and empirical evidence of the validity of this approach.
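
      Below is a highly simplified sketch of the general idea of penalizing dependence between a learned representation and a sensitive attribute. It replaces the adversarially trained Wasserstein critic and the neural text encoders of the talk with a cheap sort-based one-dimensional Wasserstein penalty on toy tabular data, so the data, the architecture, and the penalty are illustrative assumptions rather than the authors' method.

      # Illustrative sketch: task loss + a simple Wasserstein-style fairness penalty.
      import torch
      import torch.nn as nn

      torch.manual_seed(0)

      # Toy tabular data standing in for encoded text: features x, labels y,
      # and a binary sensitive attribute s that shifts the feature distribution.
      n = 512
      s = torch.arange(n) % 2                       # balanced groups of equal size
      x = torch.randn(n, 10) + 0.5 * s.unsqueeze(1).float()
      y = (x[:, 0] + 0.3 * torch.randn(n) > 0).long()

      encoder = nn.Sequential(nn.Linear(10, 16), nn.ReLU())
      head = nn.Linear(16, 2)
      opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-2)

      def w1_equal(a, b):
          # Exact 1D Wasserstein-1 distance between equal-size empirical samples,
          # computed coordinate-wise and averaged (a cheap proxy for independence).
          return (torch.sort(a, dim=0).values - torch.sort(b, dim=0).values).abs().mean()

      lam = 1.0  # weight of the fairness penalty (illustrative)
      for step in range(300):
          opt.zero_grad()
          z = encoder(x)
          task_loss = nn.functional.cross_entropy(head(z), y)
          fairness_penalty = w1_equal(z[s == 0], z[s == 1])
          (task_loss + lam * fairness_penalty).backward()
          opt.step()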

      Speaker: Charlotte Laclau (Télécom Paris)
    • 15:20 16:10
      Optimal Rates of Exact Recovery of the Matching Map 50m

      In this talk, we consider the problem of estimating the matching map between two sequences of d-dimensional noisy observations of feature vectors, possibly of different sizes n and m. We begin with the simplest case of permutation estimation and then extend it to the more general setting of estimating a matching map of unknown size k < min(n, m).
      Our main result shows that, in the high-dimensional setting, if the signal-to-noise ratio of the feature vectors is at least of order d^{1/4}, then the true matching map can be recovered exactly (without errors) with high probability. We also establish a corresponding lower bound, proving the optimality of this rate. This rate is achieved using an estimated matching, defined as the minimizer of the sum of squared distances between matched pairs of points. Since the number of matching pairs is unknown, we first estimate the parameter k. We then show that the resulting optimization problem can be formulated as a minimum-cost flow problem and solved efficiently, with complexity Õ(kn^2).
      Finally, we present numerical experiments on both synthetic and real-world data, illustrating our theoretical results and providing further insight into the properties of the estimators and algorithms studied in this work. Joint work with T. Galstyan, S. Hunanyan, and A. Dalalyan.
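
      As a complement, here is a minimal sketch of the simplest case described above (n = m and k = n, i.e. a permutation): the least-sum-of-squared-distances matching reduces to a linear assignment problem, solved here with scipy's assignment solver, whereas the general setting with an unknown k < min(n, m) relies on the minimum-cost flow formulation discussed in the talk. The dimensions, noise level, and data below are illustrative.

      # Illustrative sketch: recovering a permutation by least sum of squared distances.
      import numpy as np
      from scipy.optimize import linear_sum_assignment
      from scipy.spatial.distance import cdist

      rng = np.random.default_rng(0)

      n, d, sigma = 100, 50, 0.5
      theta = rng.normal(size=(n, d))            # true feature vectors
      perm = rng.permutation(n)                  # unknown matching map (a permutation)
      x = theta + sigma * rng.normal(size=(n, d))
      y = theta[perm] + sigma * rng.normal(size=(n, d))

      cost = cdist(y, x, metric="sqeuclidean")   # squared distances between all pairs
      row, col = linear_sum_assignment(cost)     # minimizes the sum of matched costs
      estimated_perm = col[np.argsort(row)]      # y[i] is matched to x[estimated_perm[i]]

      print("fraction of correctly matched pairs:", np.mean(estimated_perm == perm))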

      Speaker: Arshak Minasyan (CentraleSupélec)
    • 16:10 16:40
      Coffee break 30m
    • 16:40 17:30
      Training Overparametrized Neural Networks: Early Alignment Phenomenon and Simplicity Bias 50m

      The training of neural networks with first-order methods still remains poorly understood in theory, despite compelling empirical evidence. Not only is it believed that neural networks converge towards global minimizers, but the implicit bias of optimization algorithms makes them converge towards specific minimizers with nice generalization properties. This talk focuses on the early alignment phase that appears in the training dynamics of two-layer networks with small initializations. During this early alignment phase, the numerous neurons align towards a small number of key directions, hence leading to some sparsity in the number of represented neurons. While this alignment phenomenon can be at the origin of convergence towards spurious local minima of the network parameters, such local minima can actually have good properties and yield much lower excess risks than any global minimizer of the training loss. In other words, this early alignment can lead to a simplicity bias that is helpful in minimizing the test loss.
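
      As a small illustrative experiment (not taken from the talk), the sketch below trains a two-layer ReLU network with a very small initialization on a toy two-dimensional task, and uses a crude clustering of the first-layer weight directions to see how many distinct directions the neurons occupy at initialization versus after the early phase of training. The teacher function, network width, learning rate, and clustering threshold are all assumptions.

      # Illustrative sketch: early alignment of neuron directions with small initialization.
      import torch
      import torch.nn as nn

      torch.manual_seed(0)

      # Toy data: labels given by a single "teacher" direction in 2D.
      n = 256
      x = torch.randn(n, 2)
      y = torch.relu(x @ torch.tensor([1.0, 0.0])).unsqueeze(1)

      net = nn.Sequential(nn.Linear(2, 100, bias=False), nn.ReLU(),
                          nn.Linear(100, 1, bias=False))
      for p in net.parameters():
          p.data.mul_(1e-3)            # very small initialization scale

      def num_directions(w, tol=0.99):
          # Rough count of distinct neuron directions via cosine-similarity clustering.
          d = nn.functional.normalize(w, dim=1)
          sim = d @ d.t() > tol
          remaining, clusters = set(range(w.shape[0])), 0
          while remaining:
              i = remaining.pop()
              remaining -= set(torch.nonzero(sim[i]).flatten().tolist())
              clusters += 1
          return clusters

      opt = torch.optim.SGD(net.parameters(), lr=0.05)
      print("distinct directions at init:", num_directions(net[0].weight.detach()))
      for step in range(3000):
          opt.zero_grad()
          ((net(x) - y) ** 2).mean().backward()
          opt.step()
      print("distinct directions after training:", num_directions(net[0].weight.detach()))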

      Speaker: Etienne Boursier (Inria, Université Paris-Saclay)