Description
Organizers and chairs: Lei Ying and Weina Wang
Xinyun Chen (The Chinese University of Hong Kong, Shenzhen) 6/17/24, 3:30 PM
We investigate an online learning and optimization problem in a queueing system having unknown arrival rates and service-time distribution. The service provider’s objective is to seek the optimal service fee $p$ and service capacity $\mu$ so as to maximize the cumulative expected profit (the service revenue minus the capacity cost and delay penalty). We develop an online learning algorithm is...
Weina Wang (Carnegie Mellon University) 6/17/24, 4:00 PM
We consider the infinite-horizon, average-reward restless bandit problem. For this problem, a central challenge is to find asymptotically optimal policies in a computationally efficient manner in the regime where the number of arms, $N$, grows large. Existing policies, including the renowned Whittle index policy, all rely on a uniform global attractor property (UGAP) assumption to achieve...
Chen Yan 6/17/24, 4:30 PM
We explore a general reinforcement learning framework within a Markov decision process (MDP) consisting of a large number $N$ of independent sub-MDPs, linked by global constraints. In the non-learning scenario, when the model meets a specific non-degenerate condition, efficient algorithms (i.e., polynomial in $N$) exist, achieving a performance gap smaller than $\sqrt{N}$ relative to the...