
\begin{abstract}
We describe a layer-by-layer algorithm for training deep convolutional
networks, where each step involves gradient updates for a two-layer
network followed by a simple clustering algorithm. Our algorithm stems
from a deep generative model that generates images level by level,
where lower-resolution images correspond to latent semantic
classes. We analyze the convergence rate of our algorithm assuming
that the data is indeed generated according to this model (together with
additional assumptions). While we do not claim that these
assumptions are realistic for natural images, we do believe that they
capture some true properties of real data. Furthermore, we show that
our algorithm works in practice (on the CIFAR dataset), achieving
results roughly comparable to those of vanilla convolutional neural
networks trained by stochastic gradient
descent. Finally, our proof techniques may be of independent interest.
\end{abstract}