abstract:f57896f109532643.tex

1: \begin{abstract}

2:   A number of results have recently demonstrated the benefits

3:   of incorporating various constraints when training

4:   deep architectures in vision and machine learning.

5:   The advantages range from guarantees for statistical

6:   generalization to better accuracy to compression.

7:   But support for general constraints within widely

8:   used libraries remains scarce, and their broader

9:   deployment within many applications that can benefit

10:   from them remains under-explored. Part of the reason is

11:   that stochastic gradient descent (SGD), the workhorse for

12:   training deep neural networks, does not natively deal with constraints

13:   with global scope very well. In this paper, we revisit a classical first-order

14:   scheme from numerical optimization, Conditional Gradients (CG), that has thus far had limited applicability

15:   in training deep models. We show via rigorous

16:   analysis how various constraints can be naturally handled by modifications

17:   of this algorithm. We provide convergence guarantees and show a suite of

18:   immediate benefits that are possible, from training ResNets with fewer layers but better

19:   accuracy simply by substituting in our version of CG, to faster training of GANs with 50\% fewer

20:   epochs in image inpainting applications, to provably better generalization guarantees using

21:   efficiently implementable forms of recently proposed regularizers.

22:

23:   \textbf{Keywords:} Constrained Deep Learning, Conditional Gradient Algorithms, Path Norm

24: \end{abstract}

25: