\begin{abstract}
  We investigate the training and performance of generative adversarial networks using the Maximum Mean Discrepancy (MMD) as the critic, termed MMD GANs.
  As our main theoretical contribution,
  we clarify the issue of bias in GAN loss functions raised by recent work:
  we show that the gradient estimators used in the optimization process for both MMD GANs and Wasserstein GANs are unbiased,
  but learning a discriminator based on samples leads to biased gradients for the generator parameters.
  We also discuss the issue of kernel choice for the MMD critic, and characterize the kernel corresponding to the energy distance used for the Cram\'er GAN critic.
  Being an integral probability metric, the MMD benefits from training strategies recently developed for Wasserstein GANs. In experiments, the MMD GAN is able to employ a smaller critic network than the Wasserstein GAN, resulting in a simpler and faster-training
  algorithm with matching performance.
  We also propose an improved measure of GAN convergence, the \emph{Kernel Inception Distance}, and show how to use it to dynamically adapt learning rates during GAN training.
\end{abstract}