Description
A key aspect of efficiently managing serverless computing platforms is the auto-scaling of services, i.e., letting the set of computational resources allocated to a service adapt over time as a function of the traffic demand. The objective is to find a compromise between user-perceived performance and energy consumption. In this paper, we consider the "scale-per-request" auto-scaling pattern and investigate how many function instances (or servers) should be spawned each time an unfortunate job arrives, i.e., a job that finds all servers busy upon its arrival. We address this problem by following a stochastic optimization approach: taking advantage of the ability to observe the system \emph{state} over time, we develop a stochastic gradient descent scheme of the Kiefer--Wolfowitz type. At each iteration, the proposed scheme computes an estimate of the number of servers to spawn each time an unfortunate job arrives so as to minimize a given cost function. Under natural assumptions, we show that the sequence of estimates produced by our scheme is asymptotically optimal almost surely. In addition, we prove that its convergence rate is $O(n^{-2/3})$, where $n$ is the number of iterations.
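To fix ideas, the following is a minimal sketch, not the paper's algorithm, of a projected Kiefer--Wolfowitz iteration for the spawning level: the update has the generic form $k_{n+1} = \Pi\big(k_n - a_n \tfrac{\hat C(k_n + c_n) - \hat C(k_n - c_n)}{2 c_n}\big)$, where $\hat C(\cdot)$ denotes a noisy cost observation and $\Pi$ a projection onto an admissible interval. The helper \texttt{sample\_cost}, the gains $a_n \propto 1/n$ and $c_n \propto n^{-1/6}$ (the textbook choices associated with an $O(n^{-2/3})$ mean-squared-error rate), the projection bounds, and the treatment of the spawning level as a continuous variable are illustrative assumptions, not necessarily those made in the paper.
\begin{verbatim}
import numpy as np

def kw_spawn_level(sample_cost, k0=2.0, n_iter=200,
                   a0=1.0, c0=1.0, k_min=1.0, k_max=20.0):
    """Projected Kiefer--Wolfowitz iteration (illustrative sketch).

    sample_cost(k) is assumed to return a noisy observation of the cost
    incurred when roughly k servers are spawned per unfortunate job,
    e.g. obtained by simulating the system over a finite horizon.
    """
    k = k0
    for n in range(1, n_iter + 1):
        a_n = a0 / n                 # step size, classical 1/n decay
        c_n = c0 * n ** (-1.0 / 6)   # finite-difference width
        # Two-sided finite-difference gradient estimate from two noisy samples.
        grad = (sample_cost(k + c_n) - sample_cost(k - c_n)) / (2.0 * c_n)
        k = float(np.clip(k - a_n * grad, k_min, k_max))  # projected update
    return k
\end{verbatim}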
From a mathematical point of view, the stochastic optimization framework induced by auto-scaling exhibits non-standard aspects, which we approach from a general standpoint. Specifically, we consider the setting where a controller can only obtain samples of the transient, rather than stationary, behavior of the underlying stochastic system. To handle this difficulty, we develop arguments that exploit properties of the mixing time of the underlying Markov chain. By means of numerical simulations, we validate the proposed approach and quantify its gain over common existing scale-up rules.
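As an illustration of what such transient samples could look like, the toy simulator below produces one finite-horizon observation of a time-averaged cost combining waiting and energy in a crude scale-per-request model. The arrival rate \texttt{lam}, service rate \texttt{mu}, the cost weights, the immediate availability of spawned servers, the absence of scale-down, and the randomized rounding of the continuous spawning level are all simplifications introduced here for illustration; they are not the model analyzed in the paper. In principle, such a function could play the role of \texttt{sample\_cost} in the sketch above.
\begin{verbatim}
import numpy as np

def sample_cost(k, horizon=500.0, lam=1.0, mu=0.5,
                wait_weight=1.0, energy_weight=0.1, seed=None):
    """One noisy, finite-horizon (transient) cost sample from a toy
    scale-per-request model; every modeling choice here is illustrative."""
    rng = np.random.default_rng(seed)
    k = max(float(k), 0.0)
    t, servers, busy, waiting, cost = 0.0, 1, 0, 0, 0.0
    while t < horizon:
        rate = lam + busy * mu                     # total event rate
        dt = min(rng.exponential(1.0 / rate), horizon - t)
        cost += (wait_weight * waiting + energy_weight * servers) * dt
        t += dt
        if t >= horizon:
            break
        if rng.random() < lam / rate:              # next event is an arrival
            if busy < servers:
                busy += 1                          # a free server takes the job
            else:
                waiting += 1                       # unfortunate job: all servers busy
                # Randomized rounding of the continuous spawning level k
                # (an illustrative device to smooth the cost in k).
                servers += int(k) + (rng.random() < k - int(k))
        else:                                      # next event is a departure
            busy -= 1
        while waiting > 0 and busy < servers:      # waiting jobs enter service
            waiting -= 1
            busy += 1
    return cost / horizon                          # time-averaged transient cost
\end{verbatim}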