\begin{abstract}
We study the overparametrization bounds required for the global convergence of the stochastic gradient descent algorithm for a class of one-hidden-layer feed-forward neural networks equipped with the ReLU activation function.
We improve on the existing state-of-the-art results in terms of the required hidden-layer width.
We introduce a new proof technique that combines nonlinear analysis with properties of the random initialization of the network.
\end{abstract}