BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//CERN//INDICO//EN
BEGIN:VEVENT
SUMMARY:Big Data\, myths & opportunities for the consumer finance industry
DTSTART;VALUE=DATE-TIME:20160609T145000Z
DTEND;VALUE=DATE-TIME:20160609T153500Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2658@indico.math.cnrs.fr
DESCRIPTION:Speakers: Iuri Paixao (BNP Paribas)\, Khalid Saad-Zaghloul (BNP Paribas)\n\nThe digital age\, which offers access to a wide variety of structured and unstructured data in large volumes\, is transforming the consumer finance industry. BNP Paribas Personal Finance\, a European leader in the industry and a pioneer of scoring techniques in Europe\, is engaged in this transformation. The presentation will start with a vision of the digital transformation of our processes: how data management (in the sense of processing\, modelling and operational use) is the strategic lever\, and how the alliance between technology and analytics can better sustain business development. Behind the buzz\, the presentation will focus on the business opportunities offered by Big Data techniques for the consumer finance industry.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2658/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2658/
END:VEVENT
BEGIN:VEVENT
SUMMARY:What can we learn from modelling millions of patient records? A ma
chine learning perspective
DTSTART;VALUE=DATE-TIME:20160610T070000Z
DTEND;VALUE=DATE-TIME:20160610T074500Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2659@indico.math.cnrs.fr
DESCRIPTION:Speakers: Norman Poh (University of Surrey)\n\nIncreasing healthcare costs\, coupled with ageing populations in both the developing and developed worlds\, mean that it is important to understand disease demographic profiles in order to better optimize resources for quality health and care. Using Chronic Kidney Disease (CKD) as a case study\, I will present challenges related to understanding\, modelling and predicting the progression of CKD\, and show how machine learning techniques can be used to solve them. Examples include calibration of the estimated Glomerular Filtration Rate (eGFR)\, modelling of eGFR\, automatic selection of clinically relevant variables\, and non-linear dimensionality reduction for data discovery.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2659/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2659/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Invariance principles for robust learning. An illustration with re
current neural networks
DTSTART;VALUE=DATE-TIME:20160610T074500Z
DTEND;VALUE=DATE-TIME:20160610T083000Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2660@indico.math.cnrs.fr
DESCRIPTION:Speakers: Yann Ollivier (Paris-Sud University)\n\nThe optimiza
tion methods used to learn models of data are often not invariant under si
mple changes in the representation of data or of intermediate variables. F
or instance\, for neural networks\, using neural activities in [0\;1] or i
n [-1\;1] can lead to very different final performance even though the two
representations are isomorphic. Here we show how information theory\, tog
ether with a Riemannian geometric viewpoint emphasizing independence from
the details of data representation\, leads to new\, scalable algorithms fo
r training models of sequential data\, which detect more complex patterns
and use fewer training samples.\nFor the talk\, no familiarity will be ass
umed with Riemannian geometry\, neural networks\, information theory\, or
statistical learning.\n\nhttps://indico.math.cnrs.fr/event/830/contributio
ns/2660/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2660/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Advances and open questions for neural networks
DTSTART;VALUE=DATE-TIME:20160609T121500Z
DTEND;VALUE=DATE-TIME:20160609T130000Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2661@indico.math.cnrs.fr
DESCRIPTION:Speakers: Jérémie Mary (University of Lille)\n\nSince 2010\, under the name "Deep Learning"\, neural networks have become increasingly popular and have registered successes in a wide range of applications: computer Go\, image and sound categorization\, dialogue\,… This tutorial is a global presentation of the underlying techniques\, including stochastic gradient descent and convolutional networks. Links with wavelet decompositions and open questions will be presented\, as well as some demonstrations of use on pictures and texts.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2661/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2661/
END:VEVENT
BEGIN:VEVENT
SUMMARY:On the Properties of Variational Approximations of Gibbs Posterior
s
DTSTART;VALUE=DATE-TIME:20160610T120000Z
DTEND;VALUE=DATE-TIME:20160610T124500Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2662@indico.math.cnrs.fr
DESCRIPTION:Speakers: Pierre Alquier (ENSAE)\n\nPAC-Bayesian bounds are useful tools to control the prediction risk of aggregated estimators. When dealing with the exponentially weighted aggregate (EWA)\, these bounds lead in some settings to the proof that the predictions are minimax-optimal. EWA is usually computed through Monte Carlo methods. However\, in many practical applications\, the computational cost of Monte Carlo methods is prohibitive. It is thus tempting to replace them with (faster) optimization algorithms that aim at approximating EWA: we will refer to these methods as variational Bayes (VB) methods.\n\nIn this talk I will show\, thanks to a PAC-Bayesian theorem\, that VB approximations are well founded\, in the sense that the loss incurred in terms of prediction risk is negligible in some classical settings such as linear classification\, ranking... These approximations are implemented in the R package pac-vb (written by James Ridgway)\, which I will briefly introduce. I will especially insist on the proof of the PAC-Bayesian theorem in order to explain how this result can be extended to other settings.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2662/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2662/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Which analytic methods for Big Data?
DTSTART;VALUE=DATE-TIME:20160609T113000Z
DTEND;VALUE=DATE-TIME:20160609T121500Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2663@indico.math.cnrs.fr
DESCRIPTION:Speakers: Gilbert Saporta (CNAM Paris)\n\nWith massive data\, there are no sampling errors: statistical tests and confidence intervals become useless. Generative models are often less important than predictive models. Closed-form and parsimonious models are replaced by algorithms. Statistical Learning Theory\, initiated by V. Vapnik and the late A. Chervonenkis\, provides the conceptual framework for machine learning algorithms. The use of black-box models\, including ensemble models\, is a challenge for scientific users since they are difficult to interpret. We will conclude with the necessity of combining statistical and machine learning tools with causal inference to get better predictions and avoid the confusion between correlation and causality.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2663/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2663/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Approximate Bayesian inference for large datasets
DTSTART;VALUE=DATE-TIME:20160610T135000Z
DTEND;VALUE=DATE-TIME:20160610T143500Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2664@indico.math.cnrs.fr
DESCRIPTION:Speakers: Nial Friel (Dublin University)\n\nLight and Widely A
pplicable (LWA-) MCMC is a novel approximation of the Metropolis-Hastings
kernel targeting a posterior distribution defined on a large number of obs
ervations. Inspired by Approximate Bayesian Computation\, we design a Mark
ov chain whose transition makes use of an unknown but fixed fraction of th
e available data\, where the random choice of sub-sample is guided by the
fidelity of this sub-sample to the observed data\, as measured by summary
(or sufficient) statistics. LWA-MCMC is a generic and flexible approach\,
as illustrated by the diverse set of examples which we explore. In each ca
se LWA-MCMC yields excellent performance and in some cases a dramatic impr
ovement compared to existing methodologies.\n\nhttps://indico.math.cnrs.fr
/event/830/contributions/2664/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2664/
END:VEVENT
BEGIN:VEVENT
SUMMARY:High-dimensional data classification with mixtures of sphere-harde
ning distances
DTSTART;VALUE=DATE-TIME:20160610T085000Z
DTEND;VALUE=DATE-TIME:20160610T093500Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2665@indico.math.cnrs.fr
DESCRIPTION:Speakers: Alejandro Murua (Université de Montréal)\n\nWe develop a classification model for high-dimensional data that takes into account two main problems in high dimensions: the curse of dimensionality and the empty-space phenomenon. We overcome these obstacles by modeling the distribution of distances involving feature vectors instead of directly modeling the distribution of the feature vectors. The model is based on the sphere-hardening result\, which states that\, in high dimensions\, data cluster in shells.\n\nBased on asymptotics in the dimension parameter\, we show that under simple sampling conditions the distances of data points to their means are distributed as a variant of generalized gamma variables. We propose using mixtures of these distributions for both supervised and unsupervised classification of high-dimensional data. The paradigm is extended to low-dimensional data by embedding the data into higher-dimensional spaces by means of the kernel trick.\n\nPart of this work (a) has been done in collaboration with Bertrand Saulnier (Université de Montréal) and Nicolas Wicker (Université de Lille 1\; Murua and Wicker\, 2014)\, and (b) was inspired by a conversation with François Léonard (Hydro-Québec\; Leonard and Gauvin\, 2013).\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2665/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2665/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Machine Learning approaches for stock management in the retail ind
ustry
DTSTART;VALUE=DATE-TIME:20160609T140500Z
DTEND;VALUE=DATE-TIME:20160609T145000Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2666@indico.math.cnrs.fr
DESCRIPTION:Speakers: Manuel Davy (Vékia)\n\nhttps://indico.math.cnrs.fr/
event/830/contributions/2666/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2666/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Stochastic optimization and high-dimensional sampling: when Moreau
inf-convolution meets Langevin diffusion
DTSTART;VALUE=DATE-TIME:20160610T124500Z
DTEND;VALUE=DATE-TIME:20160610T133000Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2667@indico.math.cnrs.fr
DESCRIPTION:Speakers: Eric Moulines (Télécom ParisTech)\n\nRecently\, the problem of designing MCMC samplers adapted to high-dimensional Bayesian inference with sensible theoretical guarantees has received a lot of interest. The applications are numerous\, including large-scale inference in machine learning\, Bayesian nonparametrics\, Bayesian inverse problems\, and aggregation of experts\, among others. When the density is L-smooth (the log-density is continuously differentiable and its derivative is Lipschitz)\, we will advocate the use of a “rejection-free” algorithm based on the Euler discretization of the Langevin diffusion with either constant or decreasing stepsizes. We will present several new results allowing convergence to stationarity under different conditions on the log-density (from the weakest\, bounded oscillations on a compact set and super-exponential tails\, to the strongest\, strong concavity). When the log-density is not smooth (a problem which typically appears when using sparsity-inducing priors\, for example)\, we still suggest using an Euler discretization\, but of the Moreau envelope of the non-smooth part of the log-density. An importance sampling correction may later be applied to correct for the target. Several numerical illustrations will be presented to show that this algorithm (named MYULA) can be used in practice in high-dimensional settings. Finally\, non-asymptotic convergence bounds (in total variation and Wasserstein distances) are derived.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2667/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2667/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Reuse of big data in healthcare: presentation\, transformation and analysis of data extracted from electronic health records
DTSTART;VALUE=DATE-TIME:20160609T130000Z
DTEND;VALUE=DATE-TIME:20160609T134500Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2668@indico.math.cnrs.fr
DESCRIPTION:Speakers: Emmanuel Chazard (Université Lille 2)\n\nRoutine care of hospitalized patients makes it possible to generate and store huge amounts of data. Typical datasets are made of medico-administrative data including encoded diagnoses and procedures\, laboratory results\, drug administrations and free-text reports. The exploitation of those data raises issues of data quality\, confidentiality\, data aggregation\, and expert interpretation. Due to the structure of those data (for instance\, each inpatient stay may have 1 to n diagnostic codes\, among about 35\,000 possible codes)\, the data aggregation process has a critical impact on the analysis. This aggregation requires skills in programming and statistics\, but also a deep knowledge of the data collection process and of the medical analysis.\nThis presentation will also show three examples of successful data mining and data reuse: adverse drug event detection and prevention\, scheduling of patient admissions in elective surgery\, and hospital billing improvement.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2668/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2668/
END:VEVENT
BEGIN:VEVENT
SUMMARY:Construction of tight wavelet-like frames on graphs for denoising
DTSTART;VALUE=DATE-TIME:20160610T093500Z
DTEND;VALUE=DATE-TIME:20160610T102000Z
DTSTAMP;VALUE=DATE-TIME:20221207T011600Z
UID:indico-contribution-2669@indico.math.cnrs.fr
DESCRIPTION:Speakers: Gilles Blanchard (University of Potsdam)\n\nWe construct a frame (redundant dictionary) for the space of real-valued functions defined on a neighborhood graph constructed from data points. This frame is adapted to the underlying geometrical structure (e.g. the points belong to an unknown low-dimensional manifold)\, has finitely many elements\, and these elements are localized in frequency as well as in space. The construction follows the ideas of Hammond et al. (2011)\, with the key point that we construct a tight (or Parseval) frame. This means we have a very simple\, explicit reconstruction formula for every function defined on the graph from the coefficients given by its scalar products with the frame elements. We use this representation in the setting of denoising\, where we are given noisy observations of a function defined on the graph. By applying a thresholding method to the coefficients in the reconstruction formula\, we define an estimate of the function whose risk satisfies a tight oracle inequality.\n\nhttps://indico.math.cnrs.fr/event/830/contributions/2669/
LOCATION:Grand Amphithéâtre (Ecole Centrale Lille)
URL:https://indico.math.cnrs.fr/event/830/contributions/2669/
END:VEVENT
END:VCALENDAR