\begin{abstract}
\vspace{0.2cm}
Stochastic gradient descent (SGD) optimization algorithms are key ingredients in many machine learning applications. In this article we perform a rigorous strong error analysis for SGD optimization algorithms. In particular, we prove, for every arbitrarily small $\varepsilon \in (0,\infty)$ and every arbitrarily large $p\in (0,\infty)$, that the considered SGD optimization algorithm converges in the strong $L^p$-sense with order $\nicefrac{1}{2}-\varepsilon$ to the global minimum of the objective function of the considered stochastic approximation problem. This result holds under standard convexity-type assumptions on the objective function and relaxed assumptions on the moments of the stochastic errors appearing in the employed SGD optimization algorithm.
%
%The key ingredient in our proofs is to employ Lyapunov-type functions
%to first establish strong $L^2$-convergence rates and then to obtain strong $L^p$-convergence rates by performing an induction argument on $q\in\{2,4,6,\dots\}\cap[0,p]$.
%
The key ideas in our convergence proof are, first, to employ techniques from the theory of Lyapunov-type functions for dynamical systems to develop a general convergence machinery for SGD optimization algorithms based on such functions; second, to apply this general machinery to concrete Lyapunov-type functions with polynomial structures; and, third, to perform an induction argument along the powers appearing in the Lyapunov-type functions in order to obtain strong $ L^p $-convergence rates for every arbitrarily large $ p \in (0,\infty) $.
%
This article also contains an extensive review of results on SGD optimization algorithms in the scientific literature.
\end{abstract}