b2322f5908b3cd2c.tex
1: \begin{abstract}
2: The Maximum Mean Discrepancy (MMD) has been used successfully as a loss functional for training generative models. In a non-parametric setting, the MMD can also serve as a loss functional for learning distributions, using tools from optimal transport theory.
3: In this work, we construct a Wasserstein gradient flow of the MMD and provide an algorithm to simulate this flow. The proposed algorithm is based on a space-time discretization of the theoretical gradient flow of the MMD and seeks a probability distribution that is as close as possible to the data. We analyze the convergence of the gradient flow towards a global optimum and provide a simple algorithmic fix that improves convergence. We also show that the discretized algorithm approaches the gradient flow of the MMD as the sample size increases.
4: \end{abstract}
5: