
\begin{abstract}
  With a weighting scheme proportional to $t$, a traditional
  stochastic gradient descent (SGD) algorithm achieves a
  high-probability convergence rate of $O(\kappa/T)$ for strongly convex
  functions, instead of $O(\kappa \ln(T)/T)$. We also prove that an
  accelerated SGD algorithm achieves the same $O(\kappa/T)$ rate.
\end{abstract}
