\begin{abstract}
  A recent series of theoretical works showed that the dynamics of neural
  networks with a certain initialisation are well-captured by kernel methods.
  Concurrent empirical work demonstrated that kernel methods can come close to
  the performance of neural networks on some image classification
  tasks.%2 background
  These results raise the question of whether neural networks only learn
  successfully if kernels also learn successfully, despite neural networks being more expressive. %3 main question
  Here, we show theoretically that two-layer neural networks (2LNN) with
  only a few neurons can beat the performance of kernel learning on a simple Gaussian mixture classification task. %4 main result
  We study the high-dimensional limit,
  i.e.~when the number of samples is linearly proportional to the dimension, and
  show that while small 2LNN achieve near-optimal performance on this task, lazy
  training approaches such as random features and kernel methods do
  not.% 5 Detailed results
  Our analysis is based on the derivation of a closed set of equations that track the learning dynamics of the 2LNN and thus allow
  us to extract the asymptotic performance of the network as a function of
  the signal-to-noise ratio and other hyperparameters. %6 Detailed results 2
  We finally illustrate how over-parametrising the neural network leads to faster
  convergence, but does not improve its final performance. % 7 Det. results 3
\end{abstract}
