abstract:ba86c94ad6c9a687.tex

1: \begin{abstract}

2: Linear networks provide  valuable insights into the workings of neural networks in general.

3: This paper identifies conditions under which the gradient flow provably trains a linear network, in spite of the non-strict saddle points present in the optimization landscape.

4: This paper also provides the computational complexity of training linear networks with gradient flow.

5: To achieve these results, this work develops a  machinery to provably identify the stable set of gradient flow, which then enables us to

6: improve over the state of the art in the literature of linear networks~\cite{bah2019learning,arora2018convergence}.

7: Crucially, our results appear to be the first to break away from the {lazy training} regime which has dominated the literature of neural networks.

8: This work requires the network to have \emph{a} layer with one neuron, which subsumes the  networks  with a scalar output, but extending the results of this theoretical work to all linear networks remains a challenging open problem.  \end{abstract}

9: