Speaker
Description
In this talk, I will present some recent results on distribution optimization methods. First, I will discuss a nonconvex optimization technique for Wasserstein gradient flow (WGF). While WGF is guaranteed to converge to a first-order stationary point, for nonconvex functionals the limit does not necessarily satisfy the second-order optimality condition; i.e., it could be a saddle point. To resolve this problem, we propose a new algorithm for probability measure optimization, perturbed Wasserstein gradient flow (PWGF), that achieves second-order optimality for general nonconvex objectives. PWGF enhances WGF by injecting noisy perturbations near saddle points via a Gaussian process-based scheme. We theoretically derive the computational complexity for PWGF to reach a second-order stationary point, and show that it converges to a global optimum in polynomial time for strictly benign objectives.
Second, I will present an improved error analysis of the propagation of chaos (PoC) for mean field Langevin dynamics (MFLD), where PoC provides a quantitative characterization of the approximation error in terms of the number of particles. In this study, we refine the defective log-Sobolev inequality, a key result from earlier work on this problem, for a convex objective, and establish an improved PoC result that removes the exponential dependence on the regularization coefficient from the particle approximation term.
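To make the finite-particle approximation concrete, here is a minimal sketch of an Euler-Maruyama discretization of MFLD in which the mean field is replaced by the empirical measure of N particles. The toy objective, the function names, and all parameter values are assumptions chosen for illustration; they are not taken from the talk. For this particular convex toy objective the mean field limit has mean target / (1 + ridge) and variance lam / ridge, so the empirical statistics of the particles can be compared against those values.

```python
import numpy as np

# Hypothetical toy objective (an assumption for illustration, not from the talk):
#   F(mu) = 0.5 * (E_mu[x] - target)^2 + 0.5 * ridge * E_mu[x^2],
# which is convex in mu. MFLD minimizes F(mu) + lam * Ent(mu).

def grad_first_variation(x, particles, target=2.0, ridge=1.0):
    """Gradient in x of the first variation of F, with mu replaced by
    the empirical measure of `particles`."""
    return (particles.mean() - target) + ridge * x

def simulate_mfld(n_particles=2000, n_steps=2000, step=0.01, lam=0.1, seed=0):
    """Euler-Maruyama simulation of the finite-particle MFLD system."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)  # initial particles
    for _ in range(n_steps):
        drift = -grad_first_variation(x, x)  # interaction through the empirical mean
        noise = np.sqrt(2.0 * lam * step) * rng.standard_normal(n_particles)
        x = x + step * drift + noise
    return x

particles = simulate_mfld()
print("empirical mean:", particles.mean())
print("empirical variance:", particles.var())
```

The gap between these empirical statistics and their mean field counterparts is the kind of particle approximation error that a PoC bound quantifies as a function of N; the sketch only illustrates the setting and does not reproduce the analysis.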
Third, if time permits, I will introduce a novel alignment method for diffusion models from a distribution optimization perspective. The proposed method directly optimizes the distribution using the Dual Averaging method and then adopts Doob's h-transform technique to generate samples from the optimal distribution. The framework is supported by rigorous convergence guarantees and an end-to-end bound on the sampling error, which imply that when the original distribution's score is known accurately, the complexity of sampling from shifted distributions is independent of isoperimetric conditions.