Statistics and Optimization Seminar

Conformal Prediction with Missing Values

by Margaux Zaffran (INRIA, CMAP)

Europe/Paris
Salle K. Johnson, 1st floor (1R3)

Description
Uncertainty quantification for predictive models is crucial in decision-making problems. Conformal Prediction (CP) is a theoretically grounded framework for constructing prediction intervals with finite-sample, distribution-free marginal coverage guarantees for any underlying machine learning model. Missing values in real data bring additional challenges to uncertainty quantification. Despite an abundant literature on missing data, as far as we know, no existing work studies uncertainty quantification for predictive models in the presence of missing values.
 
In this talk, we will first introduce CP in detail, along with its limitations and current active research directions. Then, we will study conformal prediction with missing covariates. We first show that the marginal coverage guarantee of conformal prediction holds on imputed data for any missingness distribution and almost all imputation functions. However, we emphasize that the average coverage varies depending on the pattern of missing values: conformal methods tend to construct prediction intervals that under-cover the response conditionally on some missing patterns. This motivates our novel generalized conformalized quantile regression framework, missing data augmentation, which yields prediction intervals that are valid conditionally on the patterns of missing values, despite their exponential number. Using synthetic data and data from critical medical care, we corroborate our theory and report improved performance of our methods.
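For readers new to the topic, the following is a minimal sketch of split conformal prediction applied to imputed data. It is not the speaker's method: it uses absolute-residual scores rather than conformalized quantile regression, and it does not implement missing data augmentation. Everything in it is an illustrative assumption: the synthetic Gaussian data, the 20% completely-at-random missingness, the mean imputation, the least-squares model, and the names `split_conformal_interval`, `predict`, and `coverage_by_n_missing`.

```python
import numpy as np


def split_conformal_interval(residuals_cal, y_pred_test, alpha=0.1):
    """Symmetric split conformal intervals from absolute calibration residuals."""
    n = len(residuals_cal)
    # Finite-sample correction: take the ceil((n + 1)(1 - alpha))-th smallest residual.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points, only an infinite interval is valid.
    q = np.inf if k > n else np.sort(residuals_cal)[k - 1]
    return y_pred_test - q, y_pred_test + q


# Synthetic regression data (assumption, for illustration only).
rng = np.random.default_rng(0)
n, d = 3000, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Each covariate entry is missing independently with probability 0.2 (assumption).
mask = rng.random((n, d)) < 0.2
X_obs = np.where(mask, np.nan, X)

train, cal, test = slice(0, 1000), slice(1000, 2000), slice(2000, 3000)

# Mean imputation, with column means learned on the training split only,
# then applied identically to calibration and test points.
col_means = np.nanmean(X_obs[train], axis=0)
X_imp = np.where(mask, col_means, X_obs)

# Least-squares model fitted on the imputed training data.
A_train = np.column_stack([np.ones(1000), X_imp[train]])
beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)


def predict(Z):
    return np.column_stack([np.ones(len(Z)), Z]) @ beta


# Calibrate on held-out imputed data, then build intervals on the test split.
residuals_cal = np.abs(y[cal] - predict(X_imp[cal]))
lower, upper = split_conformal_interval(residuals_cal, predict(X_imp[test]), alpha=0.1)
covered = (y[test] >= lower) & (y[test] <= upper)

# Marginal coverage, and coverage broken down by number of missing covariates.
marginal_coverage = float(covered.mean())
n_missing = mask[test].sum(axis=1)
coverage_by_n_missing = {
    int(k): float(covered[n_missing == k].mean()) for k in np.unique(n_missing)
}

print("marginal coverage:", marginal_coverage)
print("coverage by number of missing covariates:", coverage_by_n_missing)
```

Comparing the two printed quantities gives a rough feel for the distinction the abstract draws between marginal coverage and coverage conditional on the pattern of missing values; grouping by the number of missing entries is a coarse stand-in for grouping by the full pattern.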