\begin{abstract}
In this article we study the stochastic gradient descent (SGD) optimization method
in the training of fully connected feedforward artificial neural networks
with ReLU activation. The main result of this work proves that the risk of
the SGD process converges to zero if the target function under consideration
is constant. In the established convergence result the considered artificial
neural networks consist of one input layer, one hidden layer, and one output
layer (with $d \in \N$ neurons on the input layer, $\width \in \N$ neurons on
the hidden layer, and one neuron on the output layer). The learning rates of
the SGD process are assumed to be sufficiently small,
and the input data used in the SGD process to train the artificial neural networks
are assumed to be independent and identically distributed.
\end{abstract}