Speaker
Description
A fundamental problem in the contextual optimization setting is to evaluate and compare the objective performances of data-driven decision rules. In this regard, we analyze the construction of interval estimates for these performances as an approach to produce reliable evaluations that account for the underlying statistical uncertainty. Specifically, we systematically compare common plug-in and cross-validation (CV) approaches. Despite its wide usage, the statistical benefits of cross-validation have remained only partially understood, especially in challenging nonparametric regimes for contextual optimization. In this paper we fill this gap and show that, in estimating out-of-sample performance for a wide spectrum of models, CV does not statistically outperform the simple "plug-in" approach, in which one reuses the training data for evaluation, with respect to either the asymptotic bias or the coverage accuracy of the associated interval. Our numerical results demonstrate that plug-in indeed performs no worse than CV in estimating decision performance across a wide range of contextual optimization examples.
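The sketch below is a minimal, hypothetical illustration (not the paper's actual setup) of the two evaluation approaches contrasted in the abstract: the plug-in estimate, which reuses the training data for evaluation, and a K-fold cross-validation estimate, each paired with a normal-approximation interval. The newsvendor-style cost, the least-squares decision rule `fit_rule`, and all parameter values are assumptions made for illustration only.

```python
# Hypothetical sketch: plug-in vs. K-fold CV interval estimates of decision
# performance for a simple contextual decision rule (illustrative only).
import numpy as np

rng = np.random.default_rng(0)


def cost(decision, demand):
    # Newsvendor-style cost: overage plus underage penalties (illustrative).
    return 1.0 * np.maximum(decision - demand, 0) + 2.0 * np.maximum(demand - decision, 0)


def fit_rule(X, y):
    # Least-squares contextual decision rule: order the predicted demand.
    beta, *_ = np.linalg.lstsq(np.c_[np.ones(len(X)), X], y, rcond=None)
    return lambda Xnew: np.c_[np.ones(len(Xnew)), Xnew] @ beta


def normal_ci(samples, z=1.96):
    # Normal-approximation 95% interval for the mean out-of-sample cost.
    m = samples.mean()
    s = samples.std(ddof=1) / np.sqrt(len(samples))
    return m, (m - z * s, m + z * s)


# Synthetic contextual data: demand depends linearly on a covariate plus noise.
n = 500
X = rng.normal(size=(n, 1))
y = 10 + 3 * X[:, 0] + rng.normal(scale=2, size=n)

# Plug-in: train on all data, then evaluate on the same data.
rule = fit_rule(X, y)
plugin_mean, plugin_ci = normal_ci(cost(rule(X), y))

# K-fold CV: evaluate each fold with a rule trained on the remaining folds.
K = 5
folds = np.array_split(rng.permutation(n), K)
cv_costs = np.concatenate([
    cost(fit_rule(np.delete(X, idx, axis=0), np.delete(y, idx))(X[idx]), y[idx])
    for idx in folds
])
cv_mean, cv_ci = normal_ci(cv_costs)

print(f"plug-in estimate {plugin_mean:.3f}, interval {plugin_ci}")
print(f"{K}-fold CV estimate {cv_mean:.3f}, interval {cv_ci}")
```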