\begin{abstract}
Nowozin \textit{et al.}\ showed last year how to extend the GAN
\textit{principle} to all $f$-divergences. The approach is elegant but falls
short of a full description of the supervised game, and says little about the
key player, the generator: for example, what does the generator actually
converge to if solving the GAN game means convergence in some space of
parameters? How does that provide hints on the generator's design, and how
does it compare to the flourishing but almost exclusively experimental
literature on the subject?


In this paper, we unveil a broad class of distributions for which such
convergence happens (namely, deformed exponential families, a wide superset
of exponential families) and show tight connections with the three other key
GAN parameters: loss, game and architecture. In particular, we show that
current deep architectures are able to factorize a very large number of such
densities using an especially compact design, hence displaying the power of
deep architectures and their concinnity in the $f$-GAN game. This result
holds given a sufficient condition on \textit{activation functions}, which
turns out to be satisfied by popular choices. The key to our results is a
variational generalization of an old theorem that relates the KL divergence
between regular exponential families and divergences between their natural
parameters. We complete this picture with additional results and experimental
insights on how these results may be used to ground further improvements of
GAN architectures, via (i) a principled design of the activation functions in
the generator and (ii) an explicit integration of proper composite losses'
link function in the discriminator.
\end{abstract}
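
% Illustrative note: the notation below (\phi, \psi, \theta, D_\psi) is assumed
% for this sketch and is not taken from the abstract.
For reference, the classical identity that the abstract alludes to can be
sketched as follows, under assumed notation. Write a regular exponential
family as $p_{\theta}(x) = \exp\left(\langle \theta, \phi(x) \rangle -
\psi(\theta)\right)$, with sufficient statistics $\phi$, natural parameter
$\theta$ and cumulant function $\psi$. Since $\mathrm{E}_{p_{\theta}}[\phi(x)]
= \nabla\psi(\theta)$, the KL divergence between two members reduces to a
Bregman divergence between their natural parameters, with the arguments
swapped:
\[
\mathrm{KL}(p_{\theta} \,\|\, p_{\theta'})
= \psi(\theta') - \psi(\theta) - \langle \theta' - \theta, \nabla\psi(\theta) \rangle
= D_{\psi}(\theta' \,\|\, \theta).
\]
The variational generalization mentioned in the abstract is not reproduced
here; this display only recalls the standard exponential-family case.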
