\begin{abstract}
In this paper, we first revisit the convergence of the vanilla SGD method under more general learning-rate conditions and a more general convexity assumption. Then, using the Lyapunov function technique, we establish the convergence of the momentum SGD and Nesterov accelerated SGD methods for convex and non-convex problems, respectively, under the $L$-smoothness assumption.
We also analyze the convergence of time-averaged SGD.
\end{abstract}
