BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CERN//INDICO//EN
BEGIN:VEVENT
SUMMARY:Big Data\, myths & opportunities for the consumer finance industry
DTSTART;VALUE=DATE-TIME:20160609T145000Z
DTEND;VALUE=DATE-TIME:20160609T153500Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2658@indico.math.cnrs.fr
DESCRIPTION:Speakers: Iuri Paixao (BNP Paribas)\, Khalid Saad-Zaghloul (BN
P Paribas)\nThe digital era\, which offers access to a wide variety of str
uctured and unstructured data in large volumes\, is transforming the cons
umer finance industry. BNP Paribas Personal Finance\, European leader of t
he industry and a pioneer of scoring techniques in Europe\, is engaged in t
his transformation. The presentation will start with a vision of the digit
al transformation of our processes: how data management (in the sense of p
rocessing\, modelling and operational use) is the strategic lever\, and ho
w the alliance between technology and analytics can better sustain busines
s development. Behind the buzz\, the presentation will focus on the busine
ss opportunities that Big Data techniques offer the consumer finance indus
try.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2658/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2658/
END:VEVENT
BEGIN:VEVENT
SUMMARY:What can we learn from modelling millions of patient records? A ma
chine learning perspective
DTSTART;VALUE=DATE-TIME:20160610T070000Z
DTEND;VALUE=DATE-TIME:20160610T074500Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2659@indico.math.cnrs.fr
DESCRIPTION:Speakers: Norman Poh (University of Surrey)\nRising healthcar
e costs\, coupled with ageing populations in both the developing and deve
loped worlds\, mean that it is important to understand disease demographi
c profiles in order to better optimize resources for quality health and c
are. Using Chronic Kidney Disease (CKD) as a case study\, I will present c
hallenges related to understanding\, modelling and predicting the progres
sion of CKD\, and show how machine learning techniques can be used to add
ress them. Examples include calibration of the estimated Glomerular Filtr
ation Rate (eGFR)\, modelling of eGFR\, automatic selection of clinically r
elevant variables\, and non-linear dimensionality reduction for data disc
overy.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2659/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2659/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Invariance principles for robust learning. An illustration with re
current neural networks
DTSTART;VALUE=DATE-TIME:20160610T074500Z
DTEND;VALUE=DATE-TIME:20160610T083000Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2660@indico.math.cnrs.fr
DESCRIPTION:Speakers: Yann Ollivier (Paris-Sud University)\nThe optimizati
on methods used to learn models of data are often not invariant under simp
le changes in the representation of data or of intermediate variables. For
instance\, for neural networks\, using neural activities in [0\;1] or in
[-1\;1] can lead to very different final performance even though the two r
epresentations are isomorphic. Here we show how information theory\, toget
her with a Riemannian geometric viewpoint emphasizing independence from th
e details of data representation\, leads to new\, scalable algorithms for
training models of sequential data\, which detect more complex patterns an
d use fewer training samples.\nFor the talk\, no familiarity will be assum
ed with Riemannian geometry\, neural networks\, information theory\, or st
atistical learning.\n\nhttps://indico.math.cnrs.fr/event/830/contributions
/2660/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2660/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Advances and open questions for neural networks
DTSTART;VALUE=DATE-TIME:20160609T121500Z
DTEND;VALUE=DATE-TIME:20160609T130000Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2661@indico.math.cnrs.fr
DESCRIPTION:Speakers: Jérémie Mary (University of Lille)\nSince 2010\, un
der the name "Deep Learning"\, neural networks have become increasingly p
opular and have registered successes in a wide range of applications: com
puter Go\, image and sound categorization\, dialogue\, and more. This tut
orial is a global presentation of the underlying techniques\, including s
tochastic gradient descent and convolutional networks. Some links with wa
velet decompositions and open questions will be presented\, as well as de
monstrations on images and texts.\n\nhttps://indico.math.cnrs.fr/event/83
0/contributions/2661/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2661/
END:VEVENT
BEGIN:VEVENT
SUMMARY:On the Properties of Variational Approximations of Gibbs Posterior
s
DTSTART;VALUE=DATE-TIME:20160610T120000Z
DTEND;VALUE=DATE-TIME:20160610T124500Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2662@indico.math.cnrs.fr
DESCRIPTION:Speakers: Pierre Alquier (ENSAE)\nPAC-Bayesian bounds are usef
ul tools to control the prediction risk of aggregated estimators. When dea
ling with the exponentially weighted aggregate (EWA)\, these bounds lead i
n some settings to the proof that the predictions are minimax-optimal. EWA
is usually computed through Monte Carlo methods. However\, in many practi
cal applications\, the computational cost of Monte Carlo methods is prohib
itive. It is thus tempting to replace these by (faster) optimization algor
ithms that aim at approximating EWA: we will refer to these methods as var
iational Bayes (VB) methods.\n\nIn this talk I will show\, thanks to a PAC
-Bayesian theorem\, that VB approximations are well founded\, in the sense
that the loss incurred in terms of prediction risk is negligible in some c
lassical settings such as linear classification\, ranking... These approxi
mations are implemented in the R package pac-vb (written by James Ridgway)
that I will briefly introduce. I will especially insist on the proof of t
he PAC-Bayesian theorem in order to explain how this result can be ext
ended to other settings.\n\nhttps://indico.math.cnrs.fr/event/830/contribu
tions/2662/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2662/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Which analytic methods for Big Data?
DTSTART;VALUE=DATE-TIME:20160609T113000Z
DTEND;VALUE=DATE-TIME:20160609T121500Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2663@indico.math.cnrs.fr
DESCRIPTION:Speakers: Gilbert Saporta (CNAM Paris)\nWith massive data\, t
here are no sampling errors: statistical tests and confidence intervals b
ecome useless. Generative models are often less important than predictiv
e models. Closed-form and parsimonious models are replaced by algorithm
s. Statistical Learning Theory\, initiated by V. Vapnik and the late A. C
hervonenkis\, provides the conceptual framework for machine learning algo
rithms. The use of black-box models\, including ensemble models\, is a ch
allenge for scientific users\, since they are difficult to interpret. We w
ill conclude with the necessity of combining statistical and machine lear
ning tools with causal inference to get better predictions and avoid the c
onfusion between correlation and causality.\n\nhttps://indico.math.cnrs.f
r/event/830/contributions/2663/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2663/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Approximate Bayesian inference for large datasets
DTSTART;VALUE=DATE-TIME:20160610T135000Z
DTEND;VALUE=DATE-TIME:20160610T143500Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2664@indico.math.cnrs.fr
DESCRIPTION:Speakers: Nial Friel (Dublin University)\nLight and Widely App
licable (LWA-) MCMC is a novel approximation of the Metropolis-Hastings ke
rnel targeting a posterior distribution defined on a large number of obser
vations. Inspired by Approximate Bayesian Computation\, we design a Markov
chain whose transition makes use of an unknown but fixed fraction of the
available data\, where the random choice of sub-sample is guided by the fi
delity of this sub-sample to the observed data\, as measured by summary (o
r sufficient) statistics. LWA-MCMC is a generic and flexible approach\, as
illustrated by the diverse set of examples which we explore. In each case
LWA-MCMC yields excellent performance and in some cases a dramatic improv
ement compared to existing methodologies.\n\nhttps://indico.math.cnrs.fr/e
vent/830/contributions/2664/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2664/
END:VEVENT
BEGIN:VEVENT
SUMMARY:High-dimensional data classification with mixtures of sphere-harde
ning distances
DTSTART;VALUE=DATE-TIME:20160610T085000Z
DTEND;VALUE=DATE-TIME:20160610T093500Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2665@indico.math.cnrs.fr
DESCRIPTION:Speakers: Alejandro Murua (Université de Montréal)\nWe devel
op a classification model for high dimensional data that takes into accoun
t two main problems in high dimensions: the curse of dimensionality an
d the empty space phenomenon. We overcome these obstacles by modeling the
distribution of distances involving feature vectors instead of directly m
odeling the distribution of feature vectors. The model is based on the sphe
re-hardening result which states that\, in high dimensions\, data cluster
in shells.\n\nBased on asymptotics on the dimension parameter\, we show th
at under simple sampling conditions the distances of data points to their
means are distributed as a variant of generalized gamma variables. We prop
ose using mixtures of these distributions for both supervised and unsuperv
ised classification of high-dimensional data. The paradigm is extended to
low-dimensional data by embedding the data into higher-dimensional spaces
by means of the kernel trick.\n\nPart of this work (a) has been done in co
llaboration with Bertrand Saulnier (Université de Montréal)\, and Nicola
s Wicker (Université de Lille 1\; Murua and Wicker\, 2014)\, and (b) was
inspired by a conversation with François Léonard (Hydro-Québec\; Leonar
d and Gauvin\, 2013).\n\nhttps://indico.math.cnrs.fr/event/830/contributio
ns/2665/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2665/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Machine Learning approaches for stock management in the retail ind
ustry
DTSTART;VALUE=DATE-TIME:20160609T140500Z
DTEND;VALUE=DATE-TIME:20160609T145000Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2666@indico.math.cnrs.fr
DESCRIPTION:Speakers: Manuel Davy (Vékia)\nhttps://indico.math.cnrs.fr/ev
ent/830/contributions/2666/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2666/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Stochastic optimization and high-dimensional sampling: when Moreau
inf-convolution meets Langevin diffusion
DTSTART;VALUE=DATE-TIME:20160610T124500Z
DTEND;VALUE=DATE-TIME:20160610T133000Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2667@indico.math.cnrs.fr
DESCRIPTION:Speakers: Eric Moulines (Télécom ParisTech)\nRecently\, the p
roblem of designing MCMC samplers adapted to high-dimensional Bayesian in
ference with sensible theoretical guarantees has received a lot of intere
st. The applications are numerous\, including large-scale inference in ma
chine learning\, Bayesian nonparametrics\, Bayesian inverse problems\, an
d aggregation of experts\, among others. When the density is L-smooth (th
e log-density is continuously differentiable and its derivative is Lipsch
itz)\, we will advocate the use of a “rejection-free” algorithm\, based o
n the Euler discretization of the Langevin diffusion with either constan
t or decreasing stepsizes. We will present several new results allowing c
onvergence to stationarity under different conditions on the log-densit
y (from the weakest\, bounded oscillations on a compact set and super-exp
onential tails\, to the strongest\, strong concavity). When the log-densi
ty is not smooth (a problem which typically appears when using sparsity-i
nducing priors\, for example)\, we still suggest using an Euler discretiz
ation\, but of the Moreau envelope of the non-smooth part of the log-dens
ity. An importance sampling correction may later be applied to correct th
e target. Several numerical illustrations will be presented to show that t
his algorithm (named MYULA) can be used in practice in a high-dimensiona
l setting. Finally\, non-asymptotic convergence bounds (in total variatio
n and Wasserstein distances) are derived.\n\nhttps://indico.math.cnrs.fr/
event/830/contributions/2667/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2667/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Reuse of big data in healthcare: presentation\, transformation an
d analysis of the data extracted from electronic health records
DTSTART;VALUE=DATE-TIME:20160609T130000Z
DTEND;VALUE=DATE-TIME:20160609T134500Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2668@indico.math.cnrs.fr
DESCRIPTION:Speakers: Emmanuel Chazard (Université Lille 2)\nRoutine car
e of hospitalized patients generates and stores huge amounts of data. Typ
ical datasets are made of medico-administrative data\, including encoded d
iagnoses and procedures\, laboratory results\, drug administrations and f
ree-text reports. The exploitation of those data raises issues of data qu
ality\, confidentiality\, data aggregation\, and expert interpretation. D
ue to the structure of those data (for instance\, each inpatient stay ma
y have 1 to n diagnostic codes\, among about 35\,000 possible codes)\, th
e data aggregation process has a critical impact on the analysis. This ag
gregation requires skills in programming and statistics\, but also a dee
p knowledge of the data collection process and of the medical analysis.\n
This presentation will also show three examples of successful data minin
g and data reuse: adverse drug event detection and prevention\, schedulin
g of patient admissions in elective surgery\, and hospital billing improv
ement.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2668/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2668/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Construction of tight wavelet-like frames on graphs for denoising
DTSTART;VALUE=DATE-TIME:20160610T093500Z
DTEND;VALUE=DATE-TIME:20160610T102000Z
DTSTAMP;VALUE=DATE-TIME:20210508T202902Z
UID:indico-contribution-830-2669@indico.math.cnrs.fr
DESCRIPTION:Speakers: Gilles Blanchard (University of Potsdam)\nWe constr
uct a frame (redundant dictionary) for the space of real-valued function
s defined on a neighborhood graph constructed from data points. This fram
e is adapted to the underlying geometrical structure (e.g. the points bel
ong to an unknown low-dimensional manifold)\, has finitely many elements
\, and these elements are localized in frequency as well as in space. Thi
s construction follows the ideas of Hammond et al. (2011)\, with the key p
oint that we construct a tight (or Parseval) frame. This means we have a v
ery simple\, explicit reconstruction formula for every function defined o
n the graph from the coefficients given by its scalar product with the fr
ame elements. We use this representation in the setting of denoising\, wh
ere we are given noisy observations of a function defined on the graph. B
y applying a thresholding method to the coefficients in the reconstructio
n formula\, we define an estimate of the function whose risk satisfies a t
ight oracle inequality.\n\nhttps://indico.math.cnrs.fr/event/830/contribu
tions/2669/
LOCATION:Ecole Centrale Lille Grand Amphithéâtre
URL:https://indico.math.cnrs.fr/event/830/contributions/2669/
END:VEVENT
END:VCALENDAR