Description
Organizer and chair: R. Srikant and Yashaswini Murthy
We study a two-sided network where heterogeneous demand (customers) and heterogeneous supply (workers) arrive randomly over time to be matched. Each customer and worker arrives with a randomly sampled patience time (also known as reneging time in the literature) and is lost if forced to wait longer than that time to be matched. The system dynamics depend on the matching policy, which determines...
Bayesian optimization aims to optimize expensive black-box functions using minimal function evaluations. Its key idea is to model the unknown function's structure with a surrogate model and, importantly, to quantify the associated uncertainty, which allows a sequential search over query points that balances exploration and exploitation. While the Gaussian process (GP) has been a flexible and favored...
How best to incorporate historical data when initializing control policies is an important open question for using RL in practice: more data should yield better performance, but naively initializing policies from historical samples can suffer from spurious data and imbalanced data coverage, leading to computational and storage issues. To get around this, we propose a simple...