\begin{abstract}
This paper provides a thorough theoretical analysis of Stochastic Gradient
Descent (SGD) with non-increasing step sizes. First, we show that the
recursion defining SGD can be provably approximated by solutions of a
time-inhomogeneous Stochastic Differential Equation (SDE) using an appropriate
coupling. In the specific case of batch noise, we refine our results using
recent advances in Stein's method. Then, motivated by recent analyses of
deterministic and stochastic optimization methods through their continuous
counterparts, we study the long-time behavior of the continuous processes at
hand and establish non-asymptotic bounds. To that end, we develop new
comparison techniques which are of independent interest. Adapting these
techniques to the discrete setting, we show that the same results hold for the
corresponding SGD sequences. In our analysis, we notably improve
non-asymptotic bounds in the convex setting for SGD under weaker assumptions
than those considered in previous works. Finally, we also establish
finite-time convergence results under various conditions, including
relaxations of the famous \L ojasiewicz inequality, which can be applied to a
class of non-convex functions.
\end{abstract}