Rencontres Statistiques Lyonnaises

On machine-learning methods for the estimation of conditional Kendall’s tau

by Alexis Derumigny (University of Twente)

Fokko du Cloux (Braconnier)

Description

Conditional Kendall’s tau is a measure of dependence between two random variables, conditionally on some covariates. We study three different approaches for the estimation of this conditional dependence parameter: kernel techniques, regression-type models and classification algorithms. In the first part, we give analogues of the usual statistical results (exponential bounds in probability, consistency, asymptotic normality) for the kernel-based estimator. Then, we assume a regression-type relationship between conditional Kendall’s tau and some covariates, in a parametric setting with a large number of transformations of a small number of regressors. This model may be sparse, and the underlying parameter is estimated through a penalized criterion. We prove non-asymptotic bounds with explicit constants that hold with high probability. We derive the consistency of a two-step estimator, its asymptotic law and some oracle properties. In the third part, we show how the problem of estimating conditional Kendall’s tau can be rewritten as a classification task. The goal is to predict whether a pair of observations is concordant (value of 1) or discordant (value of -1) conditionally on some covariates. The consistency and the asymptotic normality of a family of penalized approximate maximum likelihood estimators are proven, including the equivalent of the logit and probit regressions in our framework. We detail specific algorithms that adapt usual machine learning techniques, including nearest neighbors, decision trees, random forests and neural networks, to the estimation of conditional Kendall’s tau. The finite-sample properties of all of these estimators and their sensitivities to each component of the data-generating process are assessed in a simulation study. Finally, these estimators are applied to a dataset of European stock indices during and after the European debt crisis.
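To make the kernel approach from the first part concrete, here is a minimal sketch of a kernel-weighted estimator of conditional Kendall’s tau at a covariate value z0. It is an illustration under stated assumptions, not necessarily the exact estimator studied in the talk: the function name `conditional_kendall_tau`, the Gaussian kernel, the scalar covariate and the bandwidth `h` are all choices made here for the example. Each pair of observations contributes +1 if concordant and -1 if discordant, weighted by how close both covariate values are to z0.

```python
import numpy as np


def conditional_kendall_tau(x1, x2, z, z0, h):
    """Kernel-weighted estimate of Kendall's tau between x1 and x2 given z = z0.

    Hypothetical helper for illustration: Gaussian kernel, scalar covariate,
    bandwidth h chosen by the user.
    """
    x1, x2, z = (np.asarray(a, dtype=float) for a in (x1, x2, z))

    # Normalized kernel weights: observations with z close to z0 count more.
    k = np.exp(-0.5 * ((z - z0) / h) ** 2)
    w = k / k.sum()

    # s[i, j] = +1 for a concordant pair, -1 for a discordant pair, 0 on ties
    # (in particular on the diagonal i == j).
    s = np.sign(np.subtract.outer(x1, x1) * np.subtract.outer(x2, x2))

    # Weighted average over pairs i != j; the denominator is the total weight
    # of those pairs, so the estimate stays within [-1, 1].
    return float(w @ s @ w / (1.0 - np.sum(w ** 2)))
```

The same pairwise labels (+1 for concordant, -1 for discordant) are what the classification view in the third part tries to predict from the covariates: for continuous variables, conditional Kendall’s tau equals twice the conditional probability of concordance minus one.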