abstract:a60a563bd69492fb.tex

\begin{abstract}%
% \looseness=-1
%     The success of neural networks is believed to be due in part to their learning representations that are adapted to the structure of the problem and that make it possible to predict the correct output. We study the relationship between the symmetries of the problem and the structure of the learned predictor for infinitely wide two-layer ReLU networks trained with gradient flow. We show that the learned predictor inherits the orthogonal symmetries of the target function $f^*$. In particular, when $f^*$ is odd, these symmetries lead to a linear predictor, for which exponential convergence to the global minimum can be obtained. When $f^*$ has a hidden low-dimensional structure, we prove that the gradient flow PDE reduces to a lower-dimensional PDE. Furthermore, we present informal and numerical arguments suggesting that the predictor adapts to the lower-dimensional structure of the problem.
% \end{abstract}