\begin{abstract}
%     We prove tight bounds on the smallest eigenvalue of the NTK for deep ReLU networks.
    A recent line of work has analyzed the theoretical properties of deep neural networks via the Neural Tangent Kernel (NTK).
    In particular, the smallest eigenvalue of the NTK has been related to memorization capacity,
    convergence of gradient descent algorithms, and generalization of deep nets.
    However, existing results either provide bounds in the two-layer setting or assume that the spectrum of the NTK
    is bounded away from 0 for multi-layer networks.
    In this paper, we provide tight bounds on the smallest eigenvalue of NTK matrices for deep ReLU networks,
    both in the limiting case of infinite widths and for finite widths.
    In the finite-width setting, the network architectures we consider are quite general:
    we require the existence of a wide layer with roughly order $N$ neurons,
    $N$ being the number of data samples, while the scaling of the remaining widths is arbitrary (up to logarithmic factors).
    To obtain our results, we analyze various quantities of independent interest:
    we give lower bounds on the smallest singular value of feature matrices
    and upper bounds on the Lipschitz constant of input-output feature maps.
\end{abstract}