Journée Statistique et Informatique pour la Science des Données à Paris Saclay

Start: 2021-02-05T09:00:00+01:00
End: 2021-02-05T17:20:00+01:00
Location: Le Bois-Marie

Friday 5 Feb. 2021, 09:00 → 17:20 Europe/Paris

Le Bois-Marie

35, route de Chartres 91440 Bures-sur-Yvette

Description

The aim of this workshop is to bring together mathematicians and computer scientists for a series of talks on recent results in statistics, machine learning and, more generally, data science research. Topics include machine learning, optimization, deep learning, optimal transport, inverse problems, statistics and scientific reproducibility.

Registration is free and open until February 2, 2021.

Organised by: Thanh Mai PHAM NGOC (LMO) and Charles SOUSSEN (L2S)

Note: The conference will be held entirely by videoconference.

Invited speakers:

Guillaume Charpiat (LRI)
Lenaïc Chizat (LMO)
Emilie Chouzenoux (CVN)
Agnès Desolneux (Centre Borelli)
Gaël Richard (Télécom Paris)
Gaël Varoquaux (INRIA Parietal)

Participants

Aayadi Khadija
Abbass Gorgi
Abdelilah Monir
Aboubakar MAITOURNAM
Adeline Fermanian
Adnane Fouadi
Adrien Courtois
Ahmed Ben Saad
Ajmal Oodally
Alain GIBERT
Alessandro Leite
Alexandre Gramfort
Alexandre Hippert-Ferrer
Alexis BISMUTH
Amin Fehri
Anirudh Rayas
Anna Kazeykina
Antoine Collas
Aurélien Decelle
Aya Sakite
Beatriz Seoane
Benjamin Auder
Benjamin Guedj
Benoit CLAIR
Berrenur Saylam
Bertrand Maury
Bertrand Michel
Bousselham GANBOURI
caligaris claude
Canon Didier
Catherine MATIAS
Cedric Allain
Christian Derquenne
Christophe JUILLET
César CARDORELLE
Daniel Fiorilli
Daniel Wagner
David Vigouroux
Didier Lucor
Dieu merci Kimpolo nkokolo
djama abdi bachir
Désiré Sidibé
Elena MAJ
Elisabeth Lahalle
Elton Rexhepaj
Elvire Roblin
Emmanuel IDOHOU
Emmanuel Menier
Estelle Kuhn
Fanny Pouyet
Fedor Goncharov
Flora Jay
Florent Bouchard
Florian Gosselin
FRANCOIS BICHET
François Landes
François Orieux
Frédéric Barbaresco
Frédéric Pascal
Gabriele Facciolo
Gerard Kerkyacharian
Gilles Blanchard
Hui Yan
Huyen Nguyen
Héctor Climente
Ilias Ghrizi
Ines OUKID
IOANNIS BARGIOTAS
Ismaël Castillo
Jean Vidal
Jean-Armand Moroni
Jean-Loup Loyer
Jerome Buzzi
Johan Duque
Joon Kwon
Kai Zheng
Kaniav Kamary
Kare KAMILA
Khalid Akhlil
Laura Vuduc
Laurent Pierre
Lionel Mathelin
liu tupikina
Lolita Aboa
Lorenzo Audibert
Léon Faure
Malika Kharouf
Manon MOTTIER
Marc Evrard
Marc Glisse
Marc Michel
Marietta Manolessou
Mathilde Jeuland
Matthieu Nastorg
Michele Alessandro Bucci
Miha Srdinšek
Milad LEYLI ABADI
MOHAMED Alaoui
Mohammed Nabil EL KORSO
Myrto Limnios
Narcicegi Kiran
Natalia Rodriguez
Nicolas Lermé
Nilo Schwencke
Olivia Breysse
onofrio semeraro
Pablo Miralles
Pascal Bondon
Pegdwende Minoungou
Pierluigi Morra
Quang Huy Tran
Quentin Duchemin
Rahmani mostafa
Raphael LECLERCQ
RIFI Mouna
Ruocong Zhang
Ryad Belhakem
Rémi Ginestiere
Saad Balbiyad
Sabrine Bendimerad
Salvish Goomanee
Samy Clementz
sanaa zannane
Santosh Ballav Sapkota
Sara REJEB
Sebastien Treguer
SENA HERVE DAKO
Sohrab Samimi
Stefano Fortunati
Stephane RUBY
Sylvain Arlot
Taha Bouziane
Tamon Nakano
Thanh Mai PHAM NGOC
Theo Deladerriere
Thibaud Ishacian
Thibault Randrianarisoa
Théo Lacombe
Timothée Mathieu
Toumi Bouchentouf
Trésor Djonga
VAIBHAV ARORA
VIANNEY PERCHET
Victoria Bourgeais
Vivien Goepp
YAO SINAN
Yassine Mhiri
Zacharie Naulet
Zakia BENJELLOUN-TOUIMI
Zhangyun Tan
Zhen Xu

Cécile Gourgues

cecile@ihes.fr

0160926607

- 10:20 → 10:30
  
  Welcome 10m
- 10:30 → 11:10
  
  Supervised Learning with Missing Values 40m
  
  Some data come with missing values. For instance, a survey participant may skip some questions. There is an abundant statistical literature on this topic, establishing for instance how to fit models without bias due to the missingness, and proposing imputation strategies that give the analyst practical solutions. In machine learning, where the goal is to build models that minimize a prediction risk, most work defaults to these practices. As we will see, these different settings lead to different theoretical and practical solutions.
  
  I will outline some conditions under which machine-learning models yield the best-possible predictions in the presence of missing values. A striking result is that naive imputation strategies can be optimal, as the supervised-learning model does the hard work [1]. A challenge to fitting a machine-learning model is that there is a combinatorial explosion of possible missing-values patterns such that even when the output is a linear function of the fully-observed data, the optimal predictor is complex [2]. I will show how the same dedicated neural architecture can approximate well the optimal predictor for multiple missing-values mechanisms, including difficult missing-not-at-random settings [3].
  
  [1] Josse, J., Prost, N., Scornet, E., & Varoquaux, G. (2019). On the consistency of supervised learning with missing values. arXiv preprint arXiv:1902.06931.
  
  [2] Le Morvan, M., Prost, N., Josse, J., Scornet, E., & Varoquaux, G. (2020). Linear predictor on linearly-generated data with missing values: non consistency and solutions. AISTATS 2020.
  
  Speaker: Gaël Varoquaux (INRIA Parietal)
  
  Slides
  
  Video
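To make the "naive imputation" idea from the talk on supervised learning with missing values concrete, here is a minimal sketch of one such strategy: mean imputation. Appending missingness indicators, the function name `impute_mean_with_mask` and the toy data are assumptions made for illustration; they are not taken from the abstract or the cited papers.

```python
from statistics import mean

def impute_mean_with_mask(rows):
    """Replace None entries by the column mean and append 0/1 missingness indicators."""
    n_cols = len(rows[0])
    # Column means are computed over the observed entries only.
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(mean(observed) if observed else 0.0)
    completed = []
    for r in rows:
        values = [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        mask = [1.0 if r[j] is None else 0.0 for j in range(n_cols)]
        completed.append(values + mask)
    return completed

# Toy data: None marks a missing value.
X = [[1.0, None], [3.0, 4.0], [None, 8.0]]
X_completed = impute_mean_with_mask(X)
```

The completed rows can then be passed to any supervised learner. With d features there are up to 2**d missing-value patterns, which is the combinatorial explosion the abstract mentions.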
- 11:10 → 11:20
  
  Coffee break 10m
- 11:20 → 12:00
  
  Analysis of Gradient Descent on Wide Two-Layer Neural Networks 40m
  
  Artificial neural networks are a class of "prediction" functions parameterized by a large number of parameters -- called weights -- that are used in various machine learning tasks (classification, regression, etc). Given a learning task, the weights are adjusted via a gradient-based algorithm so that the corresponding predictor achieves a good performance on a given training set. In this talk, we propose an analysis of gradient descent on wide two-layer ReLU neural networks for supervised machine learning tasks, that leads to sharp characterizations of the learned predictor. The main idea is to study the dynamics when the width of the hidden layer goes to infinity, which is a Wasserstein gradient flow. While this dynamics evolves on a non-convex landscape, we show that its limit is a global minimizer if initialized properly. We also study the "implicit bias" of this algorithm when the objective is the unregularized logistic loss: among the many global minimizers, we show that it selects a specific one which is a max-margin classifier in a certain functional space. We finally discuss what these results tell us about the generalization performance and the adaptivity to low dimensional structures of neural networks. This is based on joint work with Francis Bach.
  
  Speaker: Lenaïc Chizat (LMO)
  
  Slides
  
  Video
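As a rough illustration of the setting of the talk on wide two-layer networks, the sketch below runs full-batch gradient descent on a two-layer ReLU network with the 1/m output scaling under which the infinite-width limit is usually taken. The width, the toy target `abs(x)`, the squared loss, the initialization and the step size are all assumptions chosen for the example; the talk's results concern the limit of infinite width, which this finite simulation only hints at.

```python
import random

def relu(z):
    return z if z > 0.0 else 0.0

def predict(params, x):
    """Two-layer ReLU network: f(x) = (1/m) * sum_j a_j * relu(w_j * x + b_j)."""
    a, w, b = params
    return sum(aj * relu(wj * x + bj) for aj, wj, bj in zip(a, w, b)) / len(a)

def loss(params, xs, ys):
    """Half mean squared error on the training set."""
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

def gradient_step(params, xs, ys, lr):
    """One full-batch step with step size lr * m, which cancels the 1/m factor in f."""
    a, w, b = params
    n = len(xs)
    residuals = [predict(params, x) - y for x, y in zip(xs, ys)]
    new_a, new_w, new_b = [], [], []
    for aj, wj, bj in zip(a, w, b):
        ga = gw = gb = 0.0
        for x, r in zip(xs, residuals):
            z = wj * x + bj
            active = 1.0 if z > 0.0 else 0.0
            ga += r * relu(z)           # derivative with respect to a_j
            gw += r * aj * active * x   # derivative with respect to w_j
            gb += r * aj * active       # derivative with respect to b_j
        new_a.append(aj - lr * ga / n)
        new_w.append(wj - lr * gw / n)
        new_b.append(bj - lr * gb / n)
    return new_a, new_w, new_b

rng = random.Random(0)
m = 50  # hidden width (assumed value)
params = (
    [rng.uniform(-1.0, 1.0) for _ in range(m)],
    [rng.gauss(0.0, 1.0) for _ in range(m)],
    [rng.gauss(0.0, 1.0) for _ in range(m)],
)
xs = [-1.0 + 2.0 * i / 19 for i in range(20)]
ys = [abs(x) for x in xs]  # toy regression target

initial_loss = loss(params, xs, ys)
for _ in range(200):
    params = gradient_step(params, xs, ys, lr=0.1)
final_loss = loss(params, xs, ys)
```

Each hidden unit (a_j, w_j, b_j) can be read as a particle; as m grows, the empirical distribution of these particles is the object whose evolution the abstract describes as a Wasserstein gradient flow.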
- 12:00 → 12:40
  
  Deep Unfolding of a Proximal Interior Point Method for Image Restoration 40m
  
  Variational methods have started to be widely applied to ill-posed inverse problems since they have the ability to embed prior knowledge about the solution. However, the level of performance of these methods significantly depends on a set of parameters, which can be estimated through computationally expensive and time-consuming processes. In contrast, deep learning offers very generic and efficient architectures, at the expense of explainability, since it is often used as a black-box, without any fine control over its output. Deep unfolding provides a convenient approach to combine variational-based and deep learning approaches. Starting from a variational formulation for image restoration, we develop iRestNet [1], a neural network architecture obtained by unfolding an interior point proximal algorithm. Hard constraints, encoding desirable properties for the restored image, are incorporated into the network thanks to a logarithmic barrier, while the barrier parameter, the stepsize, and the penalization weight are learned by the network. We derive explicit expressions for the gradient of the proximity operator for various choices of constraints, which allows training iRestNet with gradient descent and backpropagation. In addition, we provide theoretical results regarding the stability of the network. Numerical experiments on image deblurring problems show that the proposed approach outperforms both state-of-the-art variational and machine learning methods in terms of image quality.
  
  [1] C. Bertocchi, E. Chouzenoux, M.-C. Corbineau, J.-C. Pesquet and M. Prato. Deep Unfolding of a Proximal Interior Point Method for Image Restoration. Inverse Problems, vol. 36, pp. 034005, 2020.
  
  Speaker: Emilie Chouzenoux (CVN)
  
  Slides
  
  Video
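The deep unfolding abstract relies on closed-form gradients of the proximity operator of a logarithmic barrier. The sketch below works this out for the simplest case we could think of, a scalar positivity constraint x > 0, where c stands for the product of the step size and the barrier parameter. This scalar case and the function names are assumptions for illustration and are not claimed to be one of the constraints treated in [1].

```python
import math

def prox_log_barrier(v, c):
    """Minimizer over x > 0 of 0.5 * (x - v)**2 - c * log(x), for c > 0.

    Setting the derivative to zero gives x**2 - v * x - c = 0; the positive root is returned.
    """
    return 0.5 * (v + math.sqrt(v * v + 4.0 * c))

def prox_log_barrier_grad_v(v, c):
    """Derivative of the proximity operator with respect to its input v."""
    return 0.5 * (1.0 + v / math.sqrt(v * v + 4.0 * c))

def prox_log_barrier_grad_c(v, c):
    """Derivative with respect to c, needed if c is a learned parameter."""
    return 1.0 / math.sqrt(v * v + 4.0 * c)

v, c = -1.0, 0.1
x = prox_log_barrier(v, c)  # strictly positive even though v is negative
```

Because both derivatives are explicit, a layer built on this operator can be trained by backpropagation without differentiating through an inner solver.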
- 12:40 → 14:00
  
  Lunch 1h 20m
- 14:00 → 14:40
  
  Maximum Entropy Distributions for Image Synthesis under Statistical Constraints 40m
  
  Texture synthesis is a very challenging problem in image processing that can be stated as follows: given an exemplar image, sample a new image that has the same statistical features (empirical mean, empirical covariance, filter responses, neural network responses, etc.). Exponential models then naturally arise as the distributions that satisfy these constraints in expectation while having maximum entropy. The parameters of these exponential models then need to be estimated, and samples have to be drawn. I will explain how both can be done simultaneously with the SOUL (Stochastic Optimization with Unadjusted Langevin) algorithm. This is based on joint work with Valentin de Bortoli, Alain Durmus, Bruno Galerne and Arthur Leclaire.
  
  Speaker: Agnès Desolneux (Centre Borelli)
  
  Slides
  
  Video
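The maximum entropy talk describes estimating the parameters and drawing samples at the same time. The toy sketch below alternates one unadjusted Langevin step on the sample with one stochastic gradient step on the parameter, for a one-dimensional exponential model with density proportional to exp(-theta * x - x**2 / 2) and a single constraint on the mean. The model, the constant step sizes, the averaging and the name `soul_toy` are assumptions made for illustration; this is written in the spirit of SOUL and is not the algorithm as specified by the authors.

```python
import math
import random

def soul_toy(target_mean, n_iter=20000, gamma=0.1, delta=0.005, seed=0):
    """Fit theta so that the model with density proportional to
    exp(-theta * x - x**2 / 2) has mean target_mean."""
    rng = random.Random(seed)
    x, theta = 0.0, 0.0
    thetas = []
    for _ in range(n_iter):
        # Unadjusted Langevin step on x for the potential U(x) = theta * x + x**2 / 2.
        x = x - gamma * (theta + x) + math.sqrt(2.0 * gamma) * rng.gauss(0.0, 1.0)
        # Stochastic gradient ascent on -theta * target_mean - log Z(theta), whose
        # gradient is E_theta[x] - target_mean; the expectation is replaced by the sample x.
        theta = theta + delta * (x - target_mean)
        thetas.append(theta)
    # Average the second half of the trajectory to reduce the noise.
    tail = thetas[n_iter // 2:]
    return sum(tail) / len(tail)

theta_hat = soul_toy(target_mean=1.5)
```

In this toy model the distribution is Gaussian with mean -theta and unit variance, so the constraint is met at theta = -1.5, and `theta_hat` should land close to that value up to Monte Carlo noise.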
- 14:40 → 14:50
  
  Coffee break 10m
- 14:50 → 15:30
  
  Input Similarity from the Neural Network Perspective 40m
  
  Given a trained neural network, we aim to understand how similar it considers any two samples to be. To this end, we give a proper definition of similarity from the neural network perspective (i.e. we quantify how undissociable two inputs A and B are) by taking a machine learning viewpoint: how much would a parameter variation designed to change the output for A also affect the output for B?
  
  We study the mathematical properties of this similarity measure, and show how to estimate sample density with it, in low complexity, enabling new types of statistical analysis for neural networks. We also propose to use it during training, to enforce that examples known to be similar should also be seen as similar by the network.
  
  We then study the self-denoising phenomenon encountered in regression tasks when training neural networks on datasets with noisy labels. We exhibit a multimodal image registration task where almost perfect accuracy is reached, far beyond label noise variance. Such an impressive self-denoising phenomenon can be explained as a noise averaging effect over the labels of similar examples. We analyze data by retrieving samples perceived as similar by the network, and are able to quantify the denoising effect without requiring true labels.
  
  Speaker: Guillaume Charpiat (LRI)
  
  Slides
  
  Video
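One way to read the question posed in the input similarity talk: a gradient step that changes the output for A moves the parameters along the gradient of the output at A, and its first-order effect on the output for B is the inner product of the two parameter gradients. The sketch below computes the cosine of these gradients for a tiny scalar network. The normalization to a cosine, the tanh architecture and the weight values are assumptions for illustration; the abstract does not give the exact definition used by the speaker.

```python
import math

def output_gradient(params, x):
    """Gradient of f(x) = sum_j a_j * tanh(w_j * x) with respect to all parameters (a, w)."""
    a, w = params
    grad_a = [math.tanh(wj * x) for wj in w]
    grad_w = [aj * (1.0 - math.tanh(wj * x) ** 2) * x for aj, wj in zip(a, w)]
    return grad_a + grad_w

def similarity(params, x1, x2):
    """Cosine between the parameter gradients of the output at x1 and at x2.

    Undefined when a gradient is zero (here, at x = 0).
    """
    g1, g2 = output_gradient(params, x1), output_gradient(params, x2)
    dot = sum(u * v for u, v in zip(g1, g2))
    norm1 = math.sqrt(sum(u * u for u in g1))
    norm2 = math.sqrt(sum(v * v for v in g2))
    return dot / (norm1 * norm2)

params = ([0.5, -1.0, 2.0], [1.0, 0.3, -0.7])  # hypothetical "trained" weights (a, w)
s_near = similarity(params, 1.0, 1.1)   # nearby inputs: gradients almost aligned
s_far = similarity(params, 1.0, -1.0)   # this bias-free network is odd in x, so gradients are opposite
```

A value near 1 means that any parameter change that moves the output for one input moves the output for the other in almost the same way, which is the sense in which the network cannot dissociate them.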
- 15:30 → 16:10
  
  Deep Neural Network for Audio and Music Transformations 40m
  
  We will first discuss how deep learning techniques can be used for audio signals. To that end, we will recall some of the important characteristics of an audio signal and review some of the main deep learning architectures and concepts used in audio signal analysis. We will then illustrate some of these concepts in more detail with two applications, namely informed singing voice source separation and music style transfer.
  
  Speaker: Gaël Richard (Télécom Paris)
  
  Slides
  
  Video