Big Data: Modeling, Estimation and Selection

Name: Big Data: Modeling, Estimation and Selection
Start: 2016-06-09T11:35:00+02:00
End: 2016-06-10T17:20:00+02:00
Location: Ecole Centrale Lille

9–10 June 2016

Time zone: Europe/Paris

High-dimensional data classification with mixtures of sphere-hardening distances

10 June 2016, 10:50

45m

Grand Amphithéâtre (Ecole Centrale Lille)

Lille 1 campus, Villeneuve d'Ascq

Alejandro Murua (Université de Montréal)

We develop a classification model for high-dimensional data that addresses two main problems in high dimensions: the curse of dimensionality and the empty space phenomenon. We overcome these obstacles by modeling the distribution of distances involving feature vectors instead of directly modeling the distribution of the feature vectors themselves. The model is based on the sphere-hardening result, which states that, in high dimensions, data cluster in shells. Using asymptotics in the dimension parameter, we show that, under simple sampling conditions, the distances of data points to their means are distributed as a variant of generalized gamma variables. We propose using mixtures of these distributions for both supervised and unsupervised classification of high-dimensional data. The paradigm is extended to low-dimensional data by embedding the data into higher-dimensional spaces by means of the kernel trick. Part of this work (a) was done in collaboration with Bertrand Saulnier (Université de Montréal) and Nicolas Wicker (Université de Lille 1; Murua and Wicker, 2014), and (b) was inspired by a conversation with François Léonard (Hydro-Québec; Leonard and Gauvin, 2013).
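
As a rough illustration of the sphere-hardening effect mentioned in the abstract, the sketch below simulates isotropic standard Gaussian points and measures how the distances to the sample mean concentrate as the dimension grows. It is not the speaker's model: the Gaussian sampling, the function names distance_stats and relative_spread, and the chosen dimensions are assumptions made only for this example. In this special case the distances follow approximately a chi distribution, which belongs to the generalized gamma family, and their relative spread shrinks roughly like 1/sqrt(2d) for large dimension d.

```python
import numpy as np


def distance_stats(dim, n_points=1000, seed=0):
    """Mean and standard deviation of the Euclidean distances between
    simulated points and their sample mean.

    Assumption for this sketch: points are i.i.d. standard Gaussian in
    `dim` dimensions (hypothetical setup, not taken from the talk).
    """
    rng = np.random.default_rng(seed)
    points = rng.standard_normal((n_points, dim))
    # Distance of every point to the empirical mean of the sample.
    distances = np.linalg.norm(points - points.mean(axis=0), axis=1)
    return distances.mean(), distances.std()


def relative_spread(dim, n_points=1000, seed=0):
    """Standard deviation of the distances divided by their mean.

    A small value means the points sit in a thin shell around the mean.
    """
    mean, std = distance_stats(dim, n_points=n_points, seed=seed)
    return std / mean


# The ratio should decrease as the dimension increases.
for dim in (2, 20, 200, 1000):
    print(dim, relative_spread(dim))
```

The printed ratio decreases with the dimension: the typical distance grows like sqrt(d) while its spread stays roughly constant, so the points occupy an increasingly thin shell. Modeling this one-dimensional distance distribution, rather than the full feature vector, is the idea the abstract builds on.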

No documents.
