Description
Organizer and chair: I-Hong Hou
The combinatorial multi-armed bandit is a model for maximizing cumulative reward under uncertainty. Two important wireless network applications motivate goals beyond maximizing cumulative reward: fairness among arms (i.e., the minimum average reward required by each arm) and reward regularity (i.e., how often each arm receives the...
We study the data packet transmission problem (mmDPT) in dense cell-free millimeter wave (mmWave) networks, i.e., users sending data packet requests to access points (APs) via uplinks and APs transmitting requested data packets to users via downlinks. Our objective is to minimize the average delay in the system due to APs' limited service capacity and unreliable wireless channels between APs...
Abstract: Reinforcement learning has achieved superhuman performance on many challenging tasks. Nevertheless, many decision-making problems in network optimization and scheduling naturally involve multiple decision-making agents (e.g., a network of routers/switches and a group of decentralized controllers) and thus need to be modeled as...