\begin{abstract}
	We study the overparametrization bounds required for the global convergence of the stochastic gradient descent algorithm for a class of one-hidden-layer feed-forward neural networks equipped with the ReLU activation function.
	We improve on the existing state-of-the-art results in terms of the required hidden layer width.
	We introduce a new proof technique combining nonlinear analysis with properties of random initializations of the network.
\end{abstract}