28 July 2025 to 1 August 2025
Europe/Paris time zone

Session: ML
28 Jul 2025, 14:00, Room F206

Session chairs

ML: Exploring the synergy between stochastic optimization, dynamics, sampling, inference, and optimal transport II

  • Taiji Suzuki (The University of Tokyo / RIKEN AIP)

ML: Contributed talks

  • Vladimir Norkin (V.M.Glushkov Institute of Cybernetics)

ML: Functional Models In Stochastic Optimization

  • Andrzej Ruszczynski (Rutgers University)

ML: Contributed talks

  • Sebastian Maier (University College London)

ML: Machine Learning and Off-Policy Learning Robust to Distribution Shifts

  • Phebe Vayanos (University of Southern California)

ML: Exploring the synergy between stochastic optimization, dynamics, sampling, inference, and optimal transport I

  • Jia-Jie Zhu (Weierstrass Institute, Berlin)

ML: Contributed talks

  • Philip Thompson (FGV-EMAp)

ML: Neural networks with combinatorial optimization layers: encoding policies for (multistage) stochastic combinatorial optimization

  • Léo Baty (Ecole des Ponts)

ML: Stochastic gradient methods for reinforcement learning and optimal control

  • Caleb Ju (Georgia Tech)

Presentation materials

No documents.

  1. Taiji Suzuki (The University of Tokyo / RIKEN AIP)
    28/07/2025 14:00
    Machine learning
    Invited talk

    In this talk, I will present some recent results on distribution optimization methods. First, we will discuss a non-convex optimization technique for Wasserstein gradient flow (WGF). While WGF is guaranteed to converge to a first-order stationary point, for nonconvex functionals the converged solution does not necessarily satisfy the second-order optimality condition; i.e., it could...

  2. Austin Stromme (ENSAE Paris)
    28/07/2025 14:30
    Machine learning
    Invited talk

    The Polyak-Łojasiewicz (PL) constant for a given function exactly characterizes the exponential rate of convergence of gradient flow uniformly over initializations, and has been of major recent interest in optimization and machine learning because it is strictly weaker than strong convexity yet implies many of the same results. In the world of sampling, the log-Sobolev inequality plays an...
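
    For reference, a differentiable function $f$ with minimum value $f^\star$ satisfies the PL inequality with constant $\mu > 0$ if

        $\tfrac{1}{2}\|\nabla f(x)\|^2 \ge \mu\,(f(x) - f^\star)$ for all $x$,

    and along the gradient flow $\dot{x}_t = -\nabla f(x_t)$ a Grönwall argument gives $f(x_t) - f^\star \le e^{-2\mu t}(f(x_0) - f^\star)$, the exponential rate referred to above.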

  3. Thomas Möllenhoff (RIKEN Center for AI Project)
    28/07/2025 15:00
    Machine learning
    Invited talk

    ADMM, which originated in the 1970s, is a popular method for federated deep learning; even though many new variants of it have been proposed since then, its core algorithmic structure has remained unchanged. In this talk, we present a new way to derive and extend federated ADMM. We propose to use a structure called Bayesian Duality which exploits a duality of the posterior distributions...
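
    As background only (the classical consensus form, not the Bayesian-duality derivation presented in the talk), ADMM for $\min_x \sum_{i=1}^N f_i(x)$ splits the problem as $\min \sum_i f_i(x_i)$ subject to $x_i = z$ and iterates

        $x_i^{k+1} = \arg\min_{x_i} f_i(x_i) + \tfrac{\rho}{2}\|x_i - z^k + u_i^k\|^2$,
        $z^{k+1} = \tfrac{1}{N}\sum_{i=1}^N (x_i^{k+1} + u_i^k)$,
        $u_i^{k+1} = u_i^k + x_i^{k+1} - z^{k+1}$,

    which maps naturally onto federated training with clients $i$ and a server holding $z$.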

  4. Mohammad Reza Belbasi (Imperial College Business School)
    28/07/2025 16:30
    Machine learning
    Contributed talk

    Problem definition: A key challenge in supervised learning is data scarcity, which can cause prediction models to overfit to the training data and perform poorly out of sample. A contemporary approach to combat overfitting is offered by distributionally robust problem formulations that consider all data-generating distributions close to the empirical distribution derived from historical...
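
    Generically (a sketch; the specific discrepancy and loss used in this work are not stated in the truncated abstract), such distributionally robust formulations take the form

        $\min_{\theta}\ \sup_{\mathbb{Q}:\, d(\mathbb{Q}, \hat{\mathbb{P}}_n) \le \varepsilon}\ \mathbb{E}_{\mathbb{Q}}[\ell(\theta; \xi)]$,

    where $\hat{\mathbb{P}}_n$ is the empirical distribution of the historical data, $d$ a discrepancy between distributions, and $\varepsilon$ the radius of the ambiguity set.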

  5. Gregorio Martinez Sempere (Mines Paris - PSL)
    28/07/2025 17:00
    Machine learning

    This work explores constrained Wasserstein barycenters (CWB), extending the applicability of the classical Wasserstein barycenter (WB) to settings with prescribed geometric, statistical, or other constraints. While the WB problem has been extensively studied, constrained settings pose significant challenges, particularly for nonconvex constraint sets. Building upon the Method of Averaged Marginals (MAM), we...

  6. Vladimir Norkin (V.M.Glushkov Institute of Cybernetics)
    28/07/2025 17:30
    Machine learning
    Contributed talk

    This study proposes and explores a wide area of application for machine learning methods in applied mathematics, namely the parametric analysis of mathematical models, in particular by adaptive kernel support vector machines. The kernel support vector machine is extended with the ability to use a continuum (multivariate parametric) family of kernels to approximate the...

  7. Edith Alice Kovács (Budapest University of Technology and Economics)
    28/07/2025 18:00
    Machine learning
    Contributed talk

    Naive Bayes is one of the most widely used machine learning algorithms, appreciated for its simplicity, efficiency, and ease of interpretation—qualities that make it appealing across various fields. However, Naive Bayes operates under the assumption that the explanatory variables are conditionally independent given the class label, an assumption that often does not hold true in practice.

    To...
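
    For reference, the conditional-independence assumption in question states that, for class label $y$ and explanatory variables $x_1, \dots, x_d$,

        $P(x_1, \dots, x_d \mid y) = \prod_{i=1}^{d} P(x_i \mid y)$,

    so that classification reduces to $\hat{y} = \arg\max_y P(y) \prod_i P(x_i \mid y)$.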

  8. Darinka Dentcheva (Stevens Institute of Technology)
    29/07/2025 10:45
    Machine learning
    Invited talk

    Optimization problems with stochastic dominance constraints provide a possibility to shape risk by selecting a benchmark random outcome with a desired distribution. A difficulty arises when no feasible decision results in a distribution that dominates the benchmark. In this talk, we address the problem of choosing a tight relaxation of the stochastic dominance constraint by selecting a feasible...
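
    As background, the second-order stochastic dominance constraint $X \succeq_{(2)} Y$ commonly imposed in such models requires

        $\mathbb{E}[(\eta - X)_+] \le \mathbb{E}[(\eta - Y)_+]$ for all $\eta \in \mathbb{R}$,

    where $Y$ is the benchmark outcome; the difficulty described above arises when no feasible decision produces an $X$ satisfying this.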

  9. Andrzej Ruszczynski (Rutgers University)
    29/07/2025 11:15
    Machine learning
    Invited talk

    We consider stochastic optimization problems involving an expected value of a nonlinear function of a base random vector and a conditional expectation of another function depending on the base random vector, a dependent random vector, and the decision variables. We call such problems conditional stochastic optimization problems. They arise in many applications, such as uplift modeling,...
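
    In one generic form (a sketch; the talk's exact setting may differ), a conditional stochastic optimization problem reads

        $\min_x\ \mathbb{E}_{\xi}\big[ f\big(x, \xi, \mathbb{E}[\, g(x, \xi, \zeta) \mid \xi \,]\big) \big]$,

    where $\xi$ is the base random vector and $\zeta$ the dependent random vector; the inner conditional expectation is what distinguishes these problems from standard stochastic programs.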

  10. Milos Kopa
    29/07/2025 11:45
    Applications in energy, finance or logistics
    Contributed talk

    Stochastic dominance rules are well-characterized and widely used. This work aims to describe and better understand the situations when they do not hold by developing measures of stochastic non-dominance. They quantify the error caused by assuming that one random variable dominates another one when it does not. To calculate them, we search for a hypothetical random variable that satisfies the...

  11. Sebastian Maier (University College London)
    29/07/2025 14:00
    Machine learning
    Contributed talk

    In this paper, we introduce an approach for obtaining probabilistically guaranteed upper and lower bounds on the true optimal value of stopping problems. Bounds of existing simulation-and-regression approaches, such as those based on least squares Monte Carlo and information relaxation, are stochastic in nature and therefore do not come with a finite sample guarantee. Our data-driven approach...

  12. Dr Luigi Gallo (Department of Mechanical, Energy, and Management Engineering, University of Calabria, Italy)
    29/07/2025 14:30
    Machine learning
    Contributed talk

    Decision-makers often face complex problems under uncertainty. Statistical and Machine Learning (ML) tools can support these decisions by reducing the lack of knowledge. However, in many real-world scenarios, the uncertainty itself is induced by the decisions taken. In such cases, standard ML models that provide a priori insights into the problem environment become ineffective. In the context of...

  13. Xuan Vinh Doan (The University of Warwick)
    29/07/2025 15:00
    Machine learning
    Contributed talk

    Federated learning concerns training global models in a decentralized manner. It is important in many applications, especially when training data come from different sources that cannot be shared with the central server due to restrictions on data sharing. With the increasing capacity of data sources, training samples are usually collected and stored on a regular basis, which...

  14. Dawen WU
    29/07/2025 15:30
    Machine learning
    Contributed talk

    Chance-constrained optimization (CCO) problems are stochastic optimization problems with probabilistic constraints defined by a confidence level $\alpha$. A standard solution approach is to transform the CCO problem into a deterministic optimization problem, which is then solved by a numerical solver. However, this approach becomes computationally expensive when dealing with multiple...
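
    For reference, a chance-constrained program with confidence level $\alpha$ has the form

        $\min_x\ f(x)$ subject to $\mathbb{P}(g(x, \xi) \le 0) \ge \alpha$,

    and the deterministic transformation mentioned above replaces the probabilistic constraint with a solver-friendly surrogate.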

  15. Prof. Phebe Vayanos (University of Southern California)
    30/07/2025 10:45
    Machine learning
    Invited talk

    Logistic regression models are widely used in the social and behavioral sciences and in high-stakes domains, due to their simplicity and interpretability properties. At the same time, such domains are permeated by distribution shifts, where the distribution generating the data changes between training and deployment. In this paper, we study a distributionally robust logistic regression problem...

  16. Daniël Vos (Delft University of Technology)
    30/07/2025 11:15
    Stochastic Programming
    Invited talk

    In domains such as personalized medicine, historical data is used to learn what treatments to prescribe to maximize positive outcomes. Previous studies have proposed methods for creating prescriptive trees: human-interpretable diagrams that indicate what type of treatment an individual should get based on their measurements. However, a remaining problem is that the models perform worse over...

  17. Nathan Justin (University of Southern California)
    30/07/2025 11:45
    Stochastic Programming
    Invited talk

    We consider the problem of learning classification trees that are robust to distribution shifts between training and testing/deployment data. This problem arises frequently in high-stakes settings such as public health and social work, where data is often collected using self-reported surveys which are highly sensitive to, e.g., the framing of the questions, the time when and place where the...

  18. Rui Chen (Chinese University of Hong Kong, Shenzhen)
    30/07/2025 12:15
    Machine learning
    Invited talk

    Causal structure learning, the task of inferring causal relationships from data, is computationally challenging due to its combinatorial nature. State-of-the-art integer programming formulations suffer from exponential growth in the number of variables and constraints, while traditional column generation approaches struggle with the complexity of solving mixed-integer nonlinear programming...

  19. Ya-Ping Hsieh (ETH Zürich)
    31/07/2025 10:45
    Machine learning
    Invited talk

    The score function plays a central role in modern generative modeling, particularly in diffusion models and related score-based methods. Despite its theoretical appeal, learning the score function in practice is notoriously difficult: it is sensitive to hyperparameter choices and prone to various forms of instability that often require ad hoc corrections.

    In this talk, we study score...
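
    For context, the score of a density $p$ is $\nabla_x \log p(x)$, and score-based methods typically fit a network $s_\theta$ by minimizing a score-matching objective such as

        $\mathbb{E}_{x \sim p}\big[\, \| s_\theta(x) - \nabla_x \log p(x) \|^2 \,\big]$,

    whose practical (denoising) variants are the setting for the instabilities mentioned above.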

  20. Young-Heon Kim (University of British Columbia)
    31/07/2025 11:15
    Machine learning
    Invited talk

    The supercooled Stefan problem describes freezing of supercooled water. In contrast to the Stefan problem that describes melting of ice, the supercooled problem exhibits unstable behaviour which makes the usual PDE methods break down. We will discuss some recent progress which employs a stochastic version of optimal transport, involving optimal stopping times of the Brownian motion.

  21. Jia-Jie Zhu (Weierstrass Institute, Berlin)
    31/07/2025 11:45
    Machine learning
    Invited talk

    Many problems in machine learning can be framed as variational problems that minimize the relative entropy between two probability measures. Many recent works have exploited the connection between the (Otto-)Wasserstein gradient flow of the Kullback–Leibler (KL) divergence and various sampling, Bayesian inference, and generative modeling algorithms. In this talk, I will first contrast the...
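
    Concretely, the connection referred to here is that the Wasserstein gradient flow of $\mathrm{KL}(\mu \,\|\, \pi)$ is the Fokker-Planck equation

        $\partial_t \mu_t = \nabla \cdot \big( \mu_t \nabla \log (\mu_t / \pi) \big)$,

    whose particle realization is the Langevin dynamics $dX_t = \nabla \log \pi(X_t)\, dt + \sqrt{2}\, dW_t$ underlying the sampling algorithms mentioned above.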

  22. Dr Philip Thompson (FGV-EMAp)
    01/08/2025 10:45
    Machine learning
    Contributed talk

    The notions of relative (weak) convexity and variation (e.g., Lipschitzness and smoothness) have been successfully applied to some optimization problems, including applications in machine learning. While typically harder to prove, these properties encode better dependence of the objective with respect to the intrinsic geometry of the problem. We review previous analysis of the mirror descent...
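
    As a reminder of the standard notion involved, $f$ is $L$-smooth relative to a reference function $h$ if

        $f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + L\, D_h(y, x)$ for all $x, y$,

    where $D_h(y, x) = h(y) - h(x) - \langle \nabla h(x), y - x \rangle$ is the Bregman divergence of $h$; relative (weak) convexity is defined analogously via a matching lower bound, and mirror descent is the natural algorithm for this geometry.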

  23. Aleksandr Lobanov (MIPT)
    01/08/2025 11:15
    Machine learning
    Contributed talk

    The gradient descent (GD) method is a fundamental and likely the most popular optimization algorithm in machine learning (ML), with a history tracing back to Cauchy (1847). In this paper, we provide an improved convergence analysis of gradient descent and its variants under generalized $(L_0, L_1)$-smoothness. In particular, we show that GD has the following behavior of...
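
    For reference, the generalized $(L_0, L_1)$-smoothness condition relaxes the usual Lipschitz-gradient assumption to

        $\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\|$,

    so local smoothness may grow with the gradient norm; standard $L$-smoothness is recovered with $L_1 = 0$.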

  24. Sajad Khodadadian (Virginia Tech)
    01/08/2025 14:00
    Machine learning
    Invited talk

    Polyak averaging is a well-known technique for achieving asymptotically optimal convergence in Stochastic Approximation. In this work, we establish the first high-probability bound for general Stochastic Approximation with Polyak Averaging. We take a black-box approach, assuming access to an anytime high-probability bound for a given Stochastic Approximation, and derive tight finite-time...
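
    For reference, given stochastic approximation iterates $x_1, x_2, \dots$, Polyak (Polyak-Ruppert) averaging returns the running mean

        $\bar{x}_T = \frac{1}{T} \sum_{t=1}^{T} x_t$,

    and it is for this averaged iterate that the high-probability bounds described above are derived.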

  25. Heiko Hoppe (Technical University of Munich)
    01/08/2025 14:00
    Structured learning and stochastic combinatorial optimization: methodological perspectives and applications
    Invited talk

    When facing contextual multi-stage optimization problems, training combinatorial optimization-enriched machine learning pipelines (ML-CO-pipelines) to date either requires imitating expert solutions or utilizing unstructured learning algorithms. While the former restricts the use of ML-CO-pipelines to problems with traceable offline solutions and relatively homogeneous state spaces, the latter...

  26. Solène Delannoy-Pavy (CERMICS / RTE)
    01/08/2025 14:30
    Structured learning and stochastic combinatorial optimization: methodological perspectives and applications
    Invited talk

    Multistage stochastic optimization (MSO) is pivotal for sequential decision-making under uncertainty, with prominent approaches in stochastic optimal control and reinforcement learning. While methods like Stochastic Dual Dynamic Programming excel with moderate-dimensional states and large continuous control spaces, and reinforcement learning handles large state spaces with smaller control...

  27. Ilyas Fatkhullin (ETH Zürich)
    01/08/2025 14:30
    Machine learning
    Invited talk

    Federated learning faces severe communication bottlenecks due to the high dimensionality of model updates. Communication compression with contractive compressors (e.g., Top-K) is often preferable in practice but can degrade performance without proper handling. Error feedback (EF) mitigates such issues but has been largely restricted to smooth, unconstrained problems, limiting its real-world...
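
    As a minimal sketch of the classical error-feedback mechanism the talk builds on (single worker, plain SGD step; function and parameter names are illustrative, and the constrained/nonsmooth extensions presented in the talk are not captured here):

        import numpy as np

        def top_k(x, k):
            # Contractive Top-K compressor: keep the k largest-magnitude entries.
            out = np.zeros_like(x)
            idx = np.argpartition(np.abs(x), -k)[-k:]
            out[idx] = x[idx]
            return out

        def ef_step(params, grad, error, lr=0.1, k=10):
            # Error feedback: compress the error-corrected gradient and keep
            # the dropped residual locally for the next round.
            corrected = grad + error
            msg = top_k(corrected, k)   # compressed update sent to the server
            return params - lr * msg, corrected - msg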

  28. Léo Baty (Ecole des Ponts)
    01/08/2025 15:00
    Structured learning and stochastic combinatorial optimization: methodological perspectives and applications
    Invited talk

    Network schedule optimization is one of the main applications of OR to the air transport industry. This process involves designing aircraft schedules that are both operationally feasible and cost-optimal. Tail assignment is the process of assigning specific aircraft (tails) to planned flights, typically occurring a few days to weeks before operations. Airlines routinely solve this problem...

  29. Caleb Ju (Georgia Tech)
    01/08/2025 15:00
    Machine learning
    Invited talk

    We propose a novel termination criterion, termed the advantage gap function, for finite state and action Markov decision processes (MDP) and reinforcement learning (RL). By incorporating this advantage gap function into the design of step size rules and deriving a new linear rate of convergence that is independent of the stationary state distribution of the optimal policy, we demonstrate that...
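
    For background (standard definitions only, not the talk's specific criterion): the advantage of action $a$ in state $s$ under policy $\pi$ is

        $A^{\pi}(s, a) = Q^{\pi}(s, a) - V^{\pi}(s)$,

    and for finite MDPs a policy is optimal exactly when $\max_{s,a} A^{\pi}(s, a) \le 0$, which is why advantage-based quantities are natural termination certificates.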

  30. Lénaïc Chizat (EPFL)
    Machine learning
    Invited talk

    Sinkhorn’s algorithm is the go-to method for solving large-scale optimal transport problems. While its theoretical foundations are rich and still expanding, a blind spot has persisted regarding the widely used heuristic of ϵ-scaling (temperature annealing), which lacks convergence guarantees. In this talk I will present such guarantees as well as theoretical insights into the design of...
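
    A minimal log-domain sketch of Sinkhorn with ϵ-scaling, warm-starting the dual potentials while the temperature is annealed (the schedule and parameter names are illustrative, not those analyzed in the talk):

        import numpy as np
        from scipy.special import logsumexp

        def sinkhorn_eps_scaling(C, a, b, eps_final=0.01, eps_init=1.0,
                                 decay=0.5, inner_iters=100):
            # C: (n, m) cost matrix; a, b: source/target marginals summing to one.
            f, g = np.zeros(len(a)), np.zeros(len(b))
            eps = eps_init
            while True:
                for _ in range(inner_iters):
                    # Numerically stable log-domain Sinkhorn updates.
                    f = -eps * logsumexp((g[None, :] - C) / eps, b=b[None, :], axis=1)
                    g = -eps * logsumexp((f[:, None] - C) / eps, b=a[:, None], axis=0)
                if eps <= eps_final:
                    break
                eps = max(eps * decay, eps_final)  # anneal; f, g are warm-started
            # Primal transport plan at the final temperature.
            return a[:, None] * b[None, :] * np.exp((f[:, None] + g[None, :] - C) / eps)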

  31. Jiaxin Shi (Google DeepMind)
    Machine learning
    Invited talk

    Modern generative AI has developed along two distinct paths: autoregressive models for discrete data (such as text) and diffusion models for continuous data (like images). Bridging this divide by adapting diffusion models to handle discrete data represents a compelling avenue for unifying these disparate approaches. However, existing work in this area has been hindered by unnecessarily complex...

  32. Zhengqi Lin
    Machine learning
    Invited talk

    We propose an efficient federated dual decomposition algorithm for calculating the Wasserstein barycenter of several distributions. The algorithm does not have access to local data and uses only highly aggregate information. Owing to the absence of any matrix-vector operations, the algorithm exhibits very low complexity of each iteration and significant scalability. We illustrate its...
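
    For reference, the Wasserstein barycenter of distributions $\mu_1, \dots, \mu_N$ with weights $w_i \ge 0$, $\sum_i w_i = 1$, solves

        $\min_{\mu} \sum_{i=1}^{N} w_i\, W_2^2(\mu, \mu_i)$,

    and it is this minimization that the federated algorithm addresses without accessing the local data.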

  33. Jincheng Mei (Google DeepMind)
    Machine learning
    Invited talk

    We provide a new understanding of the stochastic gradient bandit algorithm by showing that it converges to a globally optimal policy almost surely using any constant learning rate. This result demonstrates that the stochastic gradient algorithm continues to balance exploration and exploitation appropriately even in scenarios where standard smoothness and noise control assumptions break...
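
    For reference, in its textbook form the stochastic gradient bandit algorithm maintains preferences $\theta(a)$, samples $A_t \sim \pi_t = \mathrm{softmax}(\theta_t)$, observes reward $R_t$, and updates

        $\theta_{t+1}(a) = \theta_t(a) + \eta\, R_t\, (\mathbb{1}\{a = A_t\} - \pi_t(a))$;

    the result described above concerns this type of scheme run with a constant learning rate $\eta$.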

  34. Cagil Kocyigit (University of Luxembourg)
    Machine learning
    Invited talk

    We study the problem of learning a treatment assignment policy based on observable covariates, where there are potential shifts in the distribution of covariates from historical data (training) to deployment (test). We formulate a distributionally robust policy optimization problem with the objective of maximizing the worst-case (out-of-sample) expected outcomes, considering all possible...

  35. Egor Shulgin (King Abdullah University of Science and Technology (KAUST))
    Machine learning
    Contributed talk

    Federated learning enables training machine learning models while preserving the privacy of participants. Surprisingly, there is no differentially private distributed method for smooth, non-convex optimization problems. The reason is that standard privacy techniques require bounding the participants' contributions, usually enforced via clipping of the updates. Existing literature typically...
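
    For reference, the clipping operation referred to here rescales an update $g$ to norm at most $c$:

        $\mathrm{clip}_c(g) = g \cdot \min(1,\, c / \|g\|)$,

    which bounds each participant's contribution and thereby calibrates the noise added for differential privacy.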

Agenda under construction...