Pierre Catoire - Missing values in prediction: do we need "Missing at Random"?
amphi Louis
ISPED
Speaker: Pierre Catoire from BPH
Title: Missing values in prediction: do we need "Missing at Random"?
Abstract:
Missing values affect all statistical work. In inference, where the analysis aims to describe the relationships between variables, missing data may alter the results, and the "Missing at Random" (MAR) assumption is required to guarantee that these missing values do not bias parameter estimates. It is commonly assumed that this condition is also necessary in the prediction context, where the analysis aims to estimate the probability of an outcome given the observed values of the predictors. However, recent results suggest that the MAR assumption is a sufficient, yet not necessary, condition for obtaining unbiased predictions. We propose to resolve this apparent conflict by noting that, when missing values may be present, two quantities can be targeted for prediction: the probability of the outcome given the values of the observed predictors ("Missingness-unconditional" probability, MU) and the probability of the outcome given the values of the observed predictors and their observation pattern ("Missingness-conditional" probability, MC). We identify the conditions under which these two quantities are equal, and the conditions required to estimate each of the two without bias, both of which are weaker than MAR. This results in a new nomenclature of missingness mechanisms, dedicated to prediction and distinct from the MAR classification that is essential in inference. We illustrate these results with applications to simulated data, assessing the performance of various methods for handling missing data, and discuss their implications for the estimation, validation and deployment of prediction models.
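As a toy illustration of how the MU and MC quantities can differ, the sketch below computes both exactly for a single binary predictor that may be missing. It is not the speaker's method: the distribution, all numerical values, and the function names (`joint`, `prob_y1`) are assumptions made for this example, as is the reading that, when the only predictor is missing, MU reduces to the marginal probability of the outcome.

```python
from itertools import product


def joint(p_x1, p_y1_given_x, p_miss_given_y):
    """Build {(x, y, m): probability} for binary X, Y and a flag M (m == 1: X is missing).

    Hypothetical model: Y depends on X, and the missingness of X depends on Y only.
    """
    table = {}
    for x, y, m in product((0, 1), repeat=3):
        px = p_x1 if x == 1 else 1 - p_x1
        py = p_y1_given_x[x] if y == 1 else 1 - p_y1_given_x[x]
        pm = p_miss_given_y[y] if m == 1 else 1 - p_miss_given_y[y]
        table[(x, y, m)] = px * py * pm
    return table


def prob_y1(table, keep):
    """Return P(Y = 1 | event), where keep(x, y, m) selects the conditioning event."""
    num = sum(p for (x, y, m), p in table.items() if keep(x, y, m) and y == 1)
    den = sum(p for (x, y, m), p in table.items() if keep(x, y, m))
    return num / den


# Assumed numbers: missingness of X is more likely when Y = 1.
table = joint(p_x1=0.5, p_y1_given_x={0: 0.2, 1: 0.4}, p_miss_given_y={0: 0.2, 1: 0.6})

# Patient with X missing: no predictor value is observed.
mu = prob_y1(table, lambda x, y, m: True)    # MU: ignores the observation pattern
mc = prob_y1(table, lambda x, y, m: m == 1)  # MC: conditions on "X is missing"

# Same model, but missingness independent of Y: the two quantities coincide.
table_indep = joint(p_x1=0.5, p_y1_given_x={0: 0.2, 1: 0.4}, p_miss_given_y={0: 0.2, 1: 0.2})
mu_indep = prob_y1(table_indep, lambda x, y, m: True)
mc_indep = prob_y1(table_indep, lambda x, y, m: m == 1)

print(f"outcome-dependent missingness: MU = {mu:.4f}, MC = {mc:.4f}")
print(f"independent missingness:       MU = {mu_indep:.4f}, MC = {mc_indep:.4f}")
```

With these assumed numbers, MU is about 0.30 while MC is about 0.56 when missingness depends on the outcome, and the two agree when it does not. The example only shows that the two targets can diverge; it does not reproduce the conditions identified in the talk.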
Calendar subscription link for the complete seminar series:
https://indico.math.cnrs.fr/category/711/events.ics
Program of the Biostatistics seminars:
https://indico.math.cnrs.fr/category/711/
Subscribe to the seminar mailing list:
https://diff.u-bordeaux.fr/sympa/subscribe/seminaire.biostat.bph
Past e-seminars on our YouTube channel (mostly in French): https://www.youtube.com/channel/UCURp-hEQL7k23UzGfqgEurA/videos
Biostatistics seminar series of the Department of Public Health of the University of Bordeaux and the Bordeaux Population Health UMR 1219 research center
Denis Rustand