Description
Session chair: Konstantin Avrachenkov
-
Juaren Steiger (Queen's University) 6/18/24, 4:00 PM
We consider network utility maximization for job admission, routing, and scheduling in a queueing network with unknown job utilities as a type of multi-armed bandit problem. This "Backlogged Bandit" problem is a bandit learning problem with delayed feedback due to the end-to-end delay of a job waiting in the queue of each node in its path through the network. While recent work has explored...
-
Benedikt Meylahn (Korteweg-de Vries Institute for Mathematics, University of Amsterdam) 6/18/24, 4:30 PM
We study the interpersonal trust of a population of agents, asking whether chance may decide if a population ends up in a high trust or low trust state. We model this by a discrete time, random matching stochastic coordination game. Agents are endowed with an exponential smoothing learning rule about the behaviour of their neighbours. We find that, with probability one in the long run the...
-
Prof. Vivek Borkar (Indian Institute of Technology Bombay) 6/18/24, 5:00 PM
We consider multiagent Q-learning with each agent having her own reward function, but all agents influencing the transition mechanism. By relaxing the exact optimality to a requirement of "satisficing", modelled as driving the average costs to prescribed acceptable regions, we propose a scheme that provably achieves this.