Séminaire de Probabilités

From Euclidean percolation to distance learning: proposal and challenges

by Dr Matthieu Jonckheere (LAAS, Toulouse)

Amphi Schwartz

In unsupervised statistical learning tasks such as clustering, recommendation, or dimension reduction, a notion of distance or similarity between points is crucial but usually not directly available as an input. 
We propose a new density-based estimator for weighted geodesic distances that takes into account the underlying density of the data and is suitable for nonuniform data lying on a manifold of lower dimension than the ambient space.
The consistency of the estimator is proven using tools from first-passage percolation. We then illustrate its properties and implementation and evaluate its performance for clustering tasks. Finally, we discuss the choice of the (unique) critical parameter involved and related open problems.
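
The abstract does not spell out the estimator, so the following is only an illustrative sketch of one way a density-sensitive distance can be built from a sample: shortest paths through the sample points, where each hop costs its Euclidean length raised to a power. The function name `power_weighted_distances`, the exponent `alpha`, and the toy points are assumptions made for this example and are not taken from the talk.

```python
import heapq
import math


def power_weighted_distances(points, source, alpha=2.0):
    """Shortest-path costs from points[source] to every sample point.

    Assumption for illustration: a hop between two sample points costs
    (Euclidean length) ** alpha, and paths may only visit sample points.
    """
    n = len(points)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    # Dijkstra's algorithm on the complete graph over the sample.
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale heap entry, a shorter path was already found
        for j in range(n):
            if j == i:
                continue
            step = math.dist(points[i], points[j]) ** alpha
            if d + step < dist[j]:
                dist[j] = d + step
                heapq.heappush(heap, (dist[j], j))
    return dist


# Toy sample: three points on a line plus one point far away.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 5.0)]
d = power_weighted_distances(points, source=0, alpha=2.0)
```

With `alpha` greater than 1, many short hops are cheaper than one long hop, so paths that stay inside densely sampled regions are favoured. In the toy sample, going from the first point to the third through the middle point costs 1 + 1 = 2, against 4 for the direct hop. With `alpha = 1` the cost reduces to the ordinary Euclidean distance.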

Joint work with P. Groisman, University of Buenos Aires and F. Sapienza, Berkeley.

Ongoing work with F. Chazal, F. Pascal, L. Ferrarys, Paris-Saclay.