Séminaire de Statistique et Optimisation

Finite-sample performance of the maximum likelihood estimator in logistic regression

by Jaouad Mourtada (ENSAE)

Europe/Paris
K. Johnson Room (1R3, 1st floor)

Description
The logistic model is a classical linear model describing the dependence of binary labels on multivariate features. We study the predictive performance of the maximum likelihood estimator (MLE) for logistic regression, measured by its logistic risk. We address two questions: first, the existence of the MLE, which holds when the dataset is not linearly separated, and second, its accuracy when it exists. Both properties depend on the dimension of the features and on the signal strength.
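The link between linear separation and existence of the MLE can be illustrated with a small numerical sketch. It is not taken from the talk: the toy datasets, the label convention in {-1, +1}, the absence of an intercept, and the use of plain gradient ascent are all assumptions made for illustration. On the separated dataset the likelihood has no maximizer and the iterate keeps growing; on the non-separated dataset the iterate stabilizes at a finite value.

```python
import math


def log_likelihood_gradient(theta, xs, ys):
    """Gradient of sum_i log sigma(y_i * <theta, x_i>), with labels y_i in {-1, +1}."""
    grad = [0.0] * len(theta)
    for x, y in zip(xs, ys):
        margin = y * sum(t * xi for t, xi in zip(theta, x))
        weight = 1.0 / (1.0 + math.exp(margin))  # equals sigma(-margin)
        for j, xi in enumerate(x):
            grad[j] += y * xi * weight
    return grad


def gradient_ascent(xs, ys, steps, lr=0.5):
    """Run `steps` iterations of gradient ascent on the average log-likelihood."""
    theta = [0.0] * len(xs[0])
    n = len(xs)
    for _ in range(steps):
        grad = log_likelihood_gradient(theta, xs, ys)
        theta = [t + lr * g / n for t, g in zip(theta, grad)]
    return theta


def norm(v):
    return math.sqrt(sum(a * a for a in v))


# Hypothetical toy data with one-dimensional features and no intercept.
# Separated: the sign of x predicts every label, so the MLE does not exist.
XS_SEPARATED = [[1.0], [2.0], [-1.0], [-2.0]]
YS_SEPARATED = [1, 1, -1, -1]

# Not separated: the last point contradicts the first, so the MLE exists.
XS_MIXED = [[1.0], [2.0], [-1.0], [-2.0], [1.0]]
YS_MIXED = [1, 1, -1, -1, -1]

if __name__ == "__main__":
    for name, xs, ys in [
        ("separated", XS_SEPARATED, YS_SEPARATED),
        ("not separated", XS_MIXED, YS_MIXED),
    ]:
        short_run = norm(gradient_ascent(xs, ys, steps=200))
        long_run = norm(gradient_ascent(xs, ys, steps=2000))
        print(f"{name}: |theta| after 200 steps = {short_run:.4f}, "
              f"after 2000 steps = {long_run:.4f}")
```

In the separated case the norm of the iterate grows without bound (roughly logarithmically in the number of steps for this toy example), which is the numerical signature of a non-existent MLE.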

In the case of Gaussian features and a well-specified logistic model, we describe sharp quantitative guarantees for the existence and prediction risk of the MLE. We then generalize these results in two ways: first, to non-Gaussian features satisfying a certain regularity condition, and second, to the case where the labels no longer follow the logistic model.