1: \begin{abstract}
2: In this paper, we investigate the limiting behavior of a
3: continuous-time counterpart of the Stochastic Gradient Descent (SGD)
4: algorithm applied to two-layer overparameterized neural networks, as
5: the number of neurons (\ie, the size of the hidden layer)
6: $N \to \plusinfty$. Following a probabilistic approach, we show
7: `propagation of chaos' for the particle system defined by this
8: continuous-time dynamics under different scenarios, indicating that
9: the statistical interaction between the particles asymptotically
10: vanishes. In particular, we establish quantitative convergence with
11: respect to $N$ of any particle to a solution of a mean-field
12: McKean--Vlasov equation in the metric space endowed with the
13: Wasserstein distance. In comparison to previous works on the
14: subject, we consider settings in which the sequence of stepsizes in
15: SGD may depend on both the number of neurons and the
16: iteration index. We then identify two regimes under which different
17: mean-field limits are obtained, one of them corresponding to an
18: implicitly regularized version of the minimization problem at
19: hand. We perform various experiments on real datasets to validate
20: our theoretical results, assessing the existence of these two
21: regimes on classification problems and illustrating our convergence
22: results.
23: \end{abstract}
24: