An A/B test evaluates the impact of a new technology by deploying it in a real production environment and measuring its performance on a set of users. Recent developments in A/B testing have focused on dynamic allocation using bandit models. These methods aim to minimize the cost of the test while it evaluates the variations (A or B). However, dynamic allocation with bandit methods relies on assumptions that may not hold in practice, particularly when the user population is non-homogeneous. This presentation introduces a new reinforcement learning methodology for dynamic allocation in A/B testing and discusses how to integrate evolutionary covariates for dynamic contextual allocation.
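
To make the idea of bandit-based dynamic allocation concrete, here is a minimal sketch of Thompson sampling, one standard bandit method, applied to a two-variation test. It is not the methodology introduced in the presentation. The function name `thompson_ab_test`, the simulated conversion rates, and the Beta(1, 1) prior are all illustrative assumptions. The sketch also assumes each variation has one fixed conversion rate for every user, which is the kind of homogeneity assumption that may fail in practice.

```python
import random


def thompson_ab_test(true_rates, rounds, seed=0):
    """Simulate dynamic allocation with Beta-Bernoulli Thompson sampling.

    true_rates: hypothetical conversion rate per variation, e.g. {"A": 0.05, "B": 0.20}.
    Assumes every user converts with the same fixed rate for a given variation.
    """
    rng = random.Random(seed)
    # Beta(1, 1) prior per variation, stored as [alpha, beta].
    posterior = {arm: [1, 1] for arm in true_rates}
    allocations = {arm: 0 for arm in true_rates}

    for _ in range(rounds):
        # Draw one plausible conversion rate per variation from its posterior.
        samples = {arm: rng.betavariate(a, b) for arm, (a, b) in posterior.items()}
        # Show the next user the variation with the highest sampled rate.
        chosen = max(samples, key=samples.get)
        allocations[chosen] += 1

        # Simulated user response (stands in for real production feedback).
        converted = rng.random() < true_rates[chosen]
        # Update alpha on a conversion, beta otherwise.
        posterior[chosen][0 if converted else 1] += 1

    return allocations, posterior


if __name__ == "__main__":
    allocations, posterior = thompson_ab_test({"A": 0.05, "B": 0.20}, rounds=5000)
    print(allocations)
```

As evidence accumulates, most users are routed to the variation that appears to perform better, which is how this kind of allocation reduces the cost of the test compared with a fixed 50/50 split.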