f31c16cb4201f460.tex
1: \begin{abstract}
2:   We construct a Wasserstein gradient flow of the maximum mean discrepancy (MMD) and study its convergence properties.
3:   The MMD is an integral probability metric defined for a reproducing kernel Hilbert space (RKHS), and serves as a metric on probability measures for a sufficiently rich RKHS.  We obtain conditions for convergence of the gradient flow towards a global optimum, that can be related to particle transport when optimizing neural networks.
4:   We also propose a way to regularize this MMD flow, based on an injection of noise in the gradient. This algorithmic fix comes with theoretical and empirical evidence.
5: The practical implementation of the flow is straightforward, since both the MMD and its gradient have simple closed-form expressions, which can be easily estimated with samples. 
6:   
7: \end{abstract}
8: