Session chairs
ML: Exploring the synergy between stochastic optimization, dynamics, sampling, inference, and optimal transport II
- Taiji Suzuki (The University of Tokyo / RIKEN AIP)
ML: Contributed talks
- Vladimir Norkin (V.M.Glushkov Institute of Cybernetics)
ML: Functional Models In Stochastic Optimization
- Andrzej Ruszczynski (Rutgers University)
ML: Contributed talks
- Sebastian Maier
ML: Machine Learning and Off-Policy Learning Robust to Distribution Shifts
- Phebe Vayanos (University of Southern California)
ML: Exploring the synergy between stochastic optimization, dynamics, sampling, inference, and optimal transport I
- Jia-Jie Zhu (Weierstrass Institute, Berlin)
ML: Contributed talks
- Philip Thompson (FGV-EMAp)
ML: Neural networks with combinatorial optimization layers: encoding policies for (multistage) stochastic combinatorial optimization
- Léo Baty (Ecole des Ponts)
ML: Stochastic gradient methods for reinforcement learning and optimal control
- Caleb Ju (Georgia Tech)
In this talk, I will present some recent results on distribution optimization methods. First, we will discuss a non-convex optimization technique for a Wasserstein gradient flow (WGF). While WGF is guaranteed to converge to a first-order stationary point, for nonconvex functionals the converged solution does not necessarily satisfy the second-order optimality condition; i.e., it could...
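As background (a standard formulation, with notation introduced here rather than taken from the talk), the Wasserstein gradient flow of a functional $F$ over probability measures evolves a measure $\mu_t$ via the continuity equation

\[ \partial_t \mu_t \;=\; \nabla \cdot \Big( \mu_t \, \nabla \tfrac{\delta F}{\delta \mu}(\mu_t) \Big), \]

and first-order stationarity corresponds to $\nabla \tfrac{\delta F}{\delta \mu}(\mu) = 0$ holding $\mu$-almost everywhere.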
The Polyak-Łojasiewicz (PL) constant for a given function exactly characterizes the exponential rate of convergence of gradient flow uniformly over initializations, and has been of major recent interest in optimization and machine learning because it is strictly weaker than strong convexity yet implies many of the same results. In the world of sampling, the log-Sobolev inequality plays an...
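For reference (standard definitions under one common normalization, not specific to this talk), a differentiable function $f$ with minimum value $f^\star$ satisfies the PL inequality with constant $\mu > 0$ if

\[ \tfrac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\big(f(x) - f^\star\big) \quad \text{for all } x, \]

while a target measure $\pi$ satisfies a log-Sobolev inequality with constant $\lambda > 0$ if, for all probability densities $\rho$,

\[ \mathrm{KL}(\rho\,\|\,\pi) \;\le\; \frac{1}{2\lambda} \int \Big\|\nabla \log \tfrac{\rho}{\pi}\Big\|^2 \rho \,\mathrm{d}x . \]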
ADMM is a popular method for federated deep learning which originated in the 1970s and, even though many new variants of it have been proposed since then, its core algorithmic structure has remained unchanged. In this talk, we present a new way to derive and extend federated ADMM. We propose to use a structure called Bayesian Duality which exploits a duality of the posterior distributions...
Problem definition: A key challenge in supervised learning is data scarcity, which can cause prediction models to overfit to the training data and perform poorly out of sample. A contemporary approach to combat overfitting is offered by distributionally robust problem formulations that consider all data-generating distributions close to the empirical distribution derived from historical...
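For concreteness (a generic Wasserstein distributionally robust formulation, with notation chosen here for illustration), such problems take the form

\[ \min_{\theta} \; \sup_{Q \,:\, W(Q,\, \hat{P}_N) \le \varepsilon} \; \mathbb{E}_{\xi \sim Q}\big[\ell(\theta;\xi)\big], \]

where $\hat{P}_N$ is the empirical distribution of the $N$ historical samples, $W$ a Wasserstein distance, and $\varepsilon$ the radius of the ambiguity set.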
This work explores constrained Wasserstein barycenters (CWB), extending the applicability of the classical Wasserstein barycenter (WB) to settings with pre-specified geometric, statistical, or other constraints. While the WB problem has been extensively studied, constrained settings pose significant challenges, particularly for nonconvex constraint sets. Building upon the Method of Averaged Marginals (MAM), we...
The study proposes and explores a broad area of application of machine learning methods in applied mathematics, namely the parametric analysis of mathematical models, in particular by adaptive kernel support vector machines. The kernel support vector machine is extended by the ability to use a continuum (multivariate parametric) family of kernels to approximate the...
Naive Bayes is one of the most widely used machine learning algorithms, appreciated for its simplicity, efficiency, and ease of interpretation—qualities that make it appealing across various fields. However, Naive Bayes operates under the assumption that the explanatory variables are conditionally independent given the class label, an assumption that often does not hold true in practice.
To...
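For reference, the conditional-independence assumption referred to above yields the familiar Naive Bayes factorization

\[ \mathbb{P}(y \mid x_1,\dots,x_p) \;\propto\; \mathbb{P}(y)\,\prod_{i=1}^{p} \mathbb{P}(x_i \mid y), \]

which is exactly what breaks down when the explanatory variables are dependent given the class label $y$.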
Optimization problems with stochastic dominance constraints provide a possibility to shape risk by selecting a benchmark random outcome with a desired distribution. A difficulty arises when no feasible decision results in a distribution that dominates the benchmark. In this talk, we address the problem of choosing a tight relaxation of the stochastic dominance constraint by selecting a feasible...
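As a point of reference (second-order dominance is only one common choice of dominance relation, stated here for illustration), a constraint requiring the random outcome $G(x,\xi)$ to dominate a benchmark $Y$ in the second order reads

\[ \mathbb{E}\big[(\eta - G(x,\xi))_+\big] \;\le\; \mathbb{E}\big[(\eta - Y)_+\big] \quad \text{for all } \eta \in \mathbb{R}, \]

and infeasibility of this system for every $x$ is precisely the situation motivating the relaxation discussed above.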
We consider stochastic optimization problems involving an expected value of a nonlinear function of a base random vector and a conditional expectation of another function depending on the base random vector, a dependent random vector, and the decision variables. We call such problems conditional stochastic optimization problems. They arise in many applications, such as uplift modeling,...
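Schematically (with notation chosen here for illustration rather than taken from the talk), a conditional stochastic optimization problem has the form

\[ \min_{x} \; \mathbb{E}_{\xi}\Big[ f\big(x,\,\xi,\ \mathbb{E}_{\zeta}\big[g(x,\xi,\zeta)\,\big|\,\xi\big]\big) \Big], \]

where $\xi$ is the base random vector, $\zeta$ the dependent random vector, and $x$ the decision variables.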
Stochastic dominance rules are well-characterized and widely used. This work aims to describe and better understand the situations when they do not hold by developing measures of stochastic non-dominance. They quantify the error caused by assuming that one random variable dominates another one when it does not. To calculate them, we search for a hypothetical random variable that satisfies the...
In this paper, we introduce an approach for obtaining probabilistically guaranteed upper and lower bounds on the true optimal value of stopping problems. Bounds of existing simulation-and-regression approaches, such as those based on least squares Monte Carlo and information relaxation, are stochastic in nature and therefore do not come with a finite sample guarantee. Our data-driven approach...
Decision-makers often face complex problems under uncertainty. Statistical and Machine Learning (ML) tools can support these decisions by reducing the lack of knowledge. However, in many real-world scenarios, the uncertainty itself is induced by the decisions taken. In such cases, standard ML models that provide a priori insights into the problem environment become ineffective. In the context of...
Federated learning concerns training global models in a decentralized manner. Federated learning is important in many applications, especially when training data come from different sources that cannot be shared with the central server due to restrictions on data sharing. With the increasing capacity of data sources, training samples are usually collected and stored on a regular basis, which...
Chance-constrained optimization (CCO) problems are stochastic optimization problems with probabilistic constraints defined by a confidence level $\alpha$. A standard solution approach is to transform the CCO problem into a deterministic optimization problem, which is then solved by a numerical solver. However, this approach becomes computationally expensive when dealing with multiple...
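For concreteness (generic form, with notation introduced here), a chance-constrained program with confidence level $\alpha$ can be written as

\[ \min_{x \in X} \; f(x) \quad \text{subject to} \quad \mathbb{P}\big(g(x,\xi) \le 0\big) \;\ge\; \alpha, \]

and the deterministic reformulation mentioned above replaces the probabilistic constraint with a solver-friendly surrogate.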
Logistic regression models are widely used in the social and behavioral sciences and in high-stakes domains, due to their simplicity and interpretability properties. At the same time, such domains are permeated by distribution shifts, where the distribution generating the data changes between training and deployment. In this paper, we study a distributionally robust logistic regression problem...
In domains such as personalized medicine, historical data is used to learn what treatments to prescribe to maximize positive outcomes. Previous studies have proposed methods for creating prescriptive trees: human-interpretable diagrams that indicate what type of treatment an individual should get based on their measurements. However, a remaining problem is that the models perform worse over...
We consider the problem of learning classification trees that are robust to distribution shifts between training and testing/deployment data. This problem arises frequently in high stakes settings such as public health and social work where data is often collected using self-reported surveys which are highly sensitive to e.g., the framing of the questions, the time when and place where the...
Causal structure learning, the task of inferring causal relationships from data, is computationally challenging due to its combinatorial nature. State-of-the-art integer programming formulations suffer from exponential growth in the number of variables and constraints, while traditional column generation approaches struggle with the complexity of solving mixed-integer nonlinear programming...
The score function plays a central role in modern generative modeling, particularly in diffusion models and related score-based methods. Despite its theoretical appeal, learning the score function in practice is notoriously difficult: it is sensitive to hyperparameter choices and prone to various forms of instability that often require ad hoc corrections.
In this talk, we study score...
The supercooled Stefan problem describes freezing of supercooled water. In contrast to the Stefan problem that describes melting of ice, the supercooled problem exhibits unstable behaviour which makes the usual PDE methods break down. We will discuss some recent progress which employs a stochastic version of optimal transport, involving optimal stopping times of the Brownian motion.
Many problems in machine learning can be framed as variational problems that minimize the relative entropy between two probability measures. Many recent works have exploited the connection between the (Otto-)Wasserstein gradient flow of the Kullback–Leibler (KL) divergence and various sampling, Bayesian inference, and generative modeling algorithms. In this talk, I will first contrast the...
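As background (a standard identity, not a result of the talk), the Wasserstein gradient flow of $\mathrm{KL}(\rho\,\|\,\pi)$ with target $\pi \propto e^{-V}$ is the Fokker–Planck equation underlying Langevin dynamics:

\[ \partial_t \rho_t \;=\; \nabla \cdot \big( \rho_t \,\nabla \log \tfrac{\rho_t}{\pi} \big) \;=\; \Delta \rho_t + \nabla \cdot (\rho_t \nabla V). \]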
The notions of relative (weak) convexity and variation (e.g., Lipschitzness and smoothness) have been successfully applied to some optimization problems, including applications in machine learning. While typically harder to prove, these properties encode better dependence of the objective with respect to the intrinsic geometry of the problem. We review previous analysis of the mirror descent...
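For reference (one common definition, stated here as background), $f$ is said to be $L$-smooth relative to a reference function $h$ if

\[ f(y) \;\le\; f(x) + \langle \nabla f(x),\, y - x\rangle + L\, D_h(y,x) \quad \text{for all } x, y, \]

where $D_h(y,x) = h(y) - h(x) - \langle \nabla h(x),\, y - x\rangle$ is the Bregman divergence generated by $h$; mirror descent is the natural method for objectives with this structure.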
The gradient descent (GD) method is a fundamental and likely the most popular optimization algorithm in machine learning (ML), with a history tracing back to Cauchy (1847). In this paper, we provide an improved convergence analysis of gradient descent and its variants under generalized $(L_0,L_1)$-smoothness. In particular, we show that GD has the following behavior of...
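For reference, one common formulation of generalized $(L_0,L_1)$-smoothness (stated here as background for twice-differentiable $f$, not necessarily the paper's exact assumption) is

\[ \|\nabla^2 f(x)\| \;\le\; L_0 + L_1\,\|\nabla f(x)\| \quad \text{for all } x, \]

which reduces to standard $L$-smoothness when $L_1 = 0$.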
Polyak averaging is a well-known technique for achieving asymptotically optimal convergence in Stochastic Approximation. In this work, we establish the first high-probability bound for general Stochastic Approximation with Polyak Averaging. We take a black-box approach, assuming access to an anytime high-probability bound for a given Stochastic Approximation, and derive tight finite-time...
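As background (standard form, with notation introduced here), Polyak–Ruppert averaging simply returns the running average of the stochastic approximation iterates:

\[ \bar{x}_n \;=\; \frac{1}{n}\sum_{k=1}^{n} x_k, \qquad \text{or recursively} \quad \bar{x}_n \;=\; \bar{x}_{n-1} + \frac{1}{n}\big(x_n - \bar{x}_{n-1}\big). \]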
When facing contextual multi-stage optimization problems, training combinatorial optimization-enriched machine learning pipelines (ML-CO-pipelines) to date either requires imitating expert solutions or utilizing unstructured learning algorithms. While the former restricts the use of ML-CO-pipelines to problems with traceable offline solutions and relatively homogeneous state spaces, the latter...
Multistage stochastic optimization (MSO) is pivotal for sequential decision-making under uncertainty, with prominent approaches in stochastic optimal control and reinforcement learning. While methods like Stochastic Dual Dynamic Programming excel with moderate-dimensional states and large continuous control spaces, and reinforcement learning handles large state spaces with smaller control...
Federated learning faces severe communication bottlenecks due to the high dimensionality of model updates. Communication compression with contractive compressors (e.g., Top-K) is often preferable in practice but can degrade performance without proper handling. Error feedback (EF) mitigates such issues but has been largely restricted to smooth, unconstrained problems, limiting its real-world...
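A minimal sketch of classical error feedback with a Top-K contractive compressor, shown as generic background rather than the method proposed in this talk (function names and constants are illustrative):

```python
import numpy as np

def top_k(v, k):
    """Contractive compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_sgd_step(x, grad, error, lr, k):
    """One step of classical error-feedback SGD with Top-K compression.

    The mass dropped by the compressor is stored in `error` and re-injected
    at the next step, which is what keeps compressed training from diverging.
    """
    corrected = lr * grad + error        # add the memory of previously dropped updates
    update = top_k(corrected, k)         # only this compressed vector is communicated
    new_error = corrected - update       # remember what was dropped
    return x - update, new_error
```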
Network schedule optimization is one of the main applications of OR to the air transport industry. This process involves designing aircraft schedules that are both operationally feasible and cost-optimal. Tail assignment is the process of assigning specific aircraft (tails) to planned flights, typically occurring a few days to weeks before operations. Airlines routinely solve this problem...
We propose a novel termination criterion, termed the advantage gap function, for finite state and action Markov decision processes (MDP) and reinforcement learning (RL). By incorporating this advantage gap function into the design of step size rules and deriving a new linear rate of convergence that is independent of the stationary state distribution of the optimal policy, we demonstrate that...
Sinkhorn’s algorithm is the go-to method for solving large-scale optimal transport problems. While its theoretical foundations are rich and still expanding, a blind spot has persisted regarding the widely used heuristic of ϵ-scaling (temperature annealing), which lacks convergence guarantees. In this talk I will present such guarantees as well as theoretical insights into the design of...
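A minimal log-domain sketch of Sinkhorn with the ε-scaling (annealing) heuristic mentioned above; the schedule, warm-starting scheme, and iteration counts are illustrative choices and do not reflect the guarantees presented in the talk:

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_eps_scaling(C, a, b, eps_schedule, n_iter=200):
    """Entropic OT between histograms a and b with cost matrix C.

    The dual potentials (f, g) are warm-started across the decreasing
    temperature schedule; log-domain updates stay stable at small eps.
    """
    f, g = np.zeros_like(a), np.zeros_like(b)
    for eps in eps_schedule:
        for _ in range(n_iter):
            f = eps * np.log(a) - eps * logsumexp((g[None, :] - C) / eps, axis=1)
            g = eps * np.log(b) - eps * logsumexp((f[:, None] - C) / eps, axis=0)
    # transport plan at the final temperature
    return np.exp((f[:, None] + g[None, :] - C) / eps)

# example schedule: eps_schedule = np.geomspace(1.0, 1e-3, 8)
```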
Modern generative AI has developed along two distinct paths: autoregressive models for discrete data (such as text) and diffusion models for continuous data (like images). Bridging this divide by adapting diffusion models to handle discrete data represents a compelling avenue for unifying these disparate approaches. However, existing work in this area has been hindered by unnecessarily complex...
We provide a new understanding of the stochastic gradient bandit algorithm by showing that it converges to a globally optimal policy almost surely using any constant learning rate. This result demonstrates that the stochastic gradient algorithm continues to balance exploration and exploitation appropriately even in scenarios where standard smoothness and noise control assumptions break...
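As background (the standard softmax gradient bandit update with constant step size $\eta$, stated here without a baseline), after sampling action $A_t$ from the softmax policy $\pi_{\theta_t}$ and observing reward $R_t$, the parameters are updated as

\[ \theta_{t+1,a} \;=\; \theta_{t,a} + \eta\, R_t\,\big(\mathbf{1}\{A_t = a\} - \pi_{\theta_t}(a)\big) \quad \text{for every arm } a. \]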
We study the problem of learning a treatment assignment policy based on observable covariates, where there are potential shifts in the distribution of covariates from historical data (training) to deployment (test). We formulate a distributionally robust policy optimization problem with the objective of maximizing the worst-case (out-of-sample) expected outcomes, considering all possible...
Federated learning enables training machine learning models while preserving the privacy of participants. Surprisingly, there is no differentially private distributed method for smooth, non-convex optimization problems. The reason is that standard privacy techniques require bounding the participants' contributions, usually enforced via clipping of the updates. Existing literature typically...
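A minimal sketch of the clipping-plus-Gaussian-noise step the abstract alludes to (the generic mechanism used to bound participants' contributions, with illustrative constants, not the method developed in this work):

```python
import numpy as np

def privatize_update(update, clip_norm, noise_multiplier, rng=None):
    """Clip a participant's update to bound its sensitivity, then add Gaussian noise.

    This is the standard Gaussian-mechanism recipe; calibrating noise_multiplier
    to a concrete (epsilon, delta) budget is deliberately left out of this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise
```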