\begin{abstract}
  With a weighting scheme proportional to $t$, a traditional
  stochastic gradient descent (SGD) algorithm achieves a
  high-probability convergence rate of $O(\kappa/T)$ for strongly convex
  functions, instead of $O(\kappa \ln(T)/T)$. We also prove that an
  accelerated SGD algorithm achieves the same $O(\kappa/T)$ rate.
\end{abstract}