\begin{abstract}
Learning algorithms for implicit generative models can optimize a
variety of criteria that measure how the data distribution differs
from the implicit model distribution, including the Wasserstein
distance, the Energy distance, and the Maximum Mean Discrepancy
criterion. A careful look at the geometries induced by these
distances on the space of probability measures reveals interesting
differences. In particular, we can establish surprising approximate
global convergence guarantees for the $1$-Wasserstein distance, even
when the parametric generator has a nonconvex parametrization.
\end{abstract}
