
\begin{abstract}

	We study the overparametrization bounds required for the global convergence of the stochastic gradient descent algorithm for a class of one-hidden-layer feed-forward neural networks equipped with the ReLU activation function.

	We improve on the existing state-of-the-art results in terms of the required hidden-layer width.

	We introduce a new proof technique combining nonlinear analysis with properties of random initializations of the network.

\end{abstract}
