1: \begin{abstract}
2: We study convergence properties of Stochastic Gradient Descent (SGD)
3: for convex objectives without assumptions on smoothness or strict
4: convexity. We ask whether, with high probability, the objective
5: evaluated at the candidate minimizer returned by SGD is close to the
6: minimal value of the objective.
7: We compare this result concerning the final candidate
8: minimizer (i.e.~the final model parameters learned after all gradient
9: steps) to the online learning techniques of
10: \cite{zinkevich-online-sgd} that take a rolling average of the model
11: parameters at the different steps of SGD.
12: \end{abstract}
13: