\begin{abstract}
The optimization problem behind neural networks is highly non-convex. Training with stochastic gradient
descent and its variants requires careful parameter tuning and provides no guarantee of reaching the global
optimum. In contrast, we show that, under quite weak assumptions on the data, a particular class of feedforward
neural networks can be trained to global optimality with a linear convergence rate using our nonlinear spectral method. To our knowledge, this is
the first practically feasible method that achieves such a guarantee. While the method can in principle be
applied to deep networks, for simplicity we restrict ourselves in this paper to networks with one or two hidden layers.
Our experiments confirm that these models are rich enough to achieve good performance on a series
of real-world datasets.
\end{abstract}
