\begin{abstract}
With a weighting scheme proportional to $t$, a traditional
stochastic gradient descent (SGD) algorithm achieves a
high-probability convergence rate of $O(\kappa/T)$ for strongly convex
functions, rather than $O(\kappa \ln(T)/T)$. We also prove that an
accelerated SGD algorithm achieves a rate of $O(\kappa/T)$.
\end{abstract}
