
1: \begin{abstract}

2: \looseness=-1

3: In this paper, we present a new strategy for proving that gradient flow drives deep learning architectures to zero training (or even test) loss. Our analysis uses Rayleigh quotients to prove

4: Kurdyka-\Loja{} inequalities for a broader set of neural network architectures and loss functions.

5: We show that Rayleigh quotients provide a unified view of several convergence analysis techniques in the literature. Our strategy yields convergence proofs for various examples of parametric learning. In particular, our analysis requires neither that the number of parameters tend to infinity nor that the number of samples be finite, and thus extends to test loss minimization and beyond the over-parameterized regime.

6:

7: %We present a new strategy to prove Kurdyka-\Loja{} inequalities, centered on Rayleigh quotients, to prove convergence of gradient flows to zero loss.

8: %We show that this ties together several convergence analysis techniques, and provide various examples of parametric learning for which this strategy produces a proof of convergence, with finitely many parameters and even in the presence of infinitely many samples, setting where the neural tangent kernel must have null eigenvalues, extending results from this line of research beyond the over-parameterized regime.

9: \end{abstract}

10: