\begin{abstract}
In the mean field regime, neural networks are appropriately scaled
so that as the width tends to infinity, the learning dynamics tend
to a nonlinear and nontrivial dynamical limit, known as the mean field
limit. This offers a way to study large-width neural networks by analyzing
the mean field limit. Recent works have successfully applied such
analysis to two-layer networks and provided global convergence guarantees.
The extension to multilayer networks, however, has been a highly challenging
puzzle, and little is known about the optimization efficiency in the
mean field regime when there are more than two layers.

In this work, we prove a global convergence result for unregularized
feedforward three-layer networks in the mean field regime. We first
develop a rigorous framework to establish the mean field limit of
three-layer networks under stochastic gradient descent training. To
that end, we propose the idea of a \textit{neuronal embedding}, which
comprises a fixed probability space that encapsulates neural networks
of arbitrary sizes. The identified mean field limit is then used to
prove a global convergence guarantee under suitable regularity and
convergence mode assumptions, which, unlike previous works on two-layer
networks, does not rely critically on convexity. Underlying the
result is a universal approximation property, natural to neural networks,
which, importantly, is shown to hold at \textit{any} finite training
time (not necessarily at convergence) via an algebraic topology argument.
\end{abstract}