Speaker: Justine Remiat from Bordeaux Population Health
Title: Random forests using longitudinal predictors
This seminar will be in English
Abstract: Random Forests (Breiman, 2001) are an effective predictive tool, particularly in high-dimensional settings. However, they are not well suited to longitudinal data (repeated measurements collected over time). To address this limitation, Fréchet Random Forests (Capitaine et al., 2020) were proposed. They can handle any type of data within a metric space by using a distance tailored to each data type (e.g., images, trajectories). This work aimed to (1) implement the Fréchet Random Forest for trajectory data, fully exploiting the flexibility of the generalized discrete Fréchet distance, and (2) evaluate the performance of the Fréchet Random Forest in predicting a continuous outcome from longitudinal inputs. The generalized discrete Fréchet distance depends on a time-shifting parameter, called the timescale, which modifies its behavior. We proposed two implementations: the timescale defined as a hyperparameter, or the timescale randomly drawn at each tree node to explore all time-sensitivity behaviors. A simulation study was conducted to illustrate the flexibility of the Fréchet Random Forest in capturing different scenarios of association: (i) time-sensitive association, (ii) shape-sensitive association, and (iii) a mix of both. We then applied the method to data from a population-based cohort to predict the risk of dementia from clinical marker trajectories. The simulations illustrated the flexibility of the Fréchet Random Forests to adapt to different types of association through timescale tuning. The Fréchet Random Forests also demonstrated better predictive performance, as measured by mean squared error (MSE), across all three scenarios compared to classical Random Forests with predetermined features. On the application data, the Fréchet forests outperformed classical forests, even with more irregular and sparse data, while similarly identifying predictive markers.
Thanks to its tunable timescale parameter, which can adapt to different structures of association, the Fréchet Random Forest constitutes a flexible tool for prediction based on longitudinal data.
Calendar subscription link for the complete seminar series:
https://indico.math.cnrs.fr/category/711/events.ics
Program of the Biostatistics seminars:
https://indico.math.cnrs.fr/category/711/
Subscribe to the seminar mailing list:
https://diff.u-bordeaux.fr/sympa/subscribe/seminaire.biostat.bph
Former e-seminars on our YouTube channel (mostly in French): https://www.youtube.com/channel/UCURp-hEQL7k23UzGfqgEurA/videos
Biostatistics seminar series of the Department of Public Health of the University of Bordeaux and the Bordeaux Population Health UMR 1219 research center
Boris Hejblum