-
6/17/24, 11:00 AM
Abstract: We will consider policy optimization methods in reinforcement learning where the state space is countably infinite. The motivation arises from control problems in communication networks and matching markets. We consider an algorithm called Natural Policy Gradient (NPG), which is a popular algorithm for finite state spaces, and show three results in the context of countable state...
Go to contribution page
Choose timezone
Your profile timezone: