\begin{abstract}
In this article we study the stochastic gradient descent (SGD) optimization method
in the training of fully-connected feedforward artificial neural networks
with ReLU activation. The main result of this work proves that the risk of
the SGD process converges to zero if the target function under consideration
is constant. In the established convergence result the considered artificial
neural networks consist of one input layer, one hidden layer, and one output
layer (with $d \in \N$ neurons on the input layer, $\width \in \N$ neurons on
the hidden layer, and one neuron on the output layer). The learning rates of
the SGD process are assumed to be sufficiently small
and the input data used in the SGD process to train the artificial neural networks
is assumed to be independent and identically distributed.
\end{abstract}