Regularization for Optimal Transport and Dynamic Time Warping Distances
Machine learning deals with mathematical objects that have structure. Two common structures arising in applications are point clouds (or histograms) and time series. Early progress in optimization (linear and dynamic programming) has provided powerful families of distances between these structures, namely Wasserstein distances and dynamic time warping scores. Because they rely on the minimization of a linear functional over a continuous set of couplings and a (discrete) space of alignments, respectively, both result in non-differentiable quantities. We show how two distinct smoothing strategies result in quantities that are better behaved and more suitable for machine learning, with applications to the computation of Fréchet means.
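The abstract does not say which two smoothing strategies are meant. As an illustration only, the sketch below assumes two common choices: replacing the hard minimum in the dynamic time warping recursion with a soft minimum, and adding an entropic penalty to the optimal transport problem, solved with Sinkhorn iterations. The function names (`softmin`, `dtw`, `sinkhorn`) and the parameters `gamma` and `eps` are hypothetical and not taken from the source.

```python
import numpy as np

def softmin(values, gamma):
    """Smoothed minimum -gamma * log(sum(exp(-v / gamma))), computed stably."""
    v = np.asarray(values, dtype=float)
    m = v.min()  # shift by the minimum so the exponentials cannot overflow
    return m - gamma * np.log(np.sum(np.exp(-(v - m) / gamma)))

def dtw(x, y, gamma=0.0):
    """DTW score between 1-D series with squared-difference cost.

    gamma == 0 uses the classical hard minimum (non-differentiable);
    gamma > 0 uses the soft minimum (assumed smoothing strategy).
    """
    n, m = len(x), len(y)
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            prev = [R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]]
            best = min(prev) if gamma == 0 else softmin(prev, gamma)
            R[i, j] = cost + best
    return R[n, m]

def sinkhorn(a, b, C, eps, n_iter=500):
    """Entropy-regularized transport between histograms a and b (assumed strategy).

    C is the ground cost matrix and eps the regularization strength.
    Returns the coupling P and its transport cost sum(P * C).
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)         # rescale to match column marginals
        u = a / (K @ v)           # rescale to match row marginals
    P = u[:, None] * K * v[None, :]
    return P, float(np.sum(P * C))
```

In this sketch, larger `gamma` or `eps` gives a smoother objective at the price of a larger gap to the unregularized score. The soft-min DTW value never exceeds the hard-min one, since the soft minimum is a lower bound on the minimum.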