\begin{abstract}
  We consider the minimization of an objective function given access
  to unbiased estimates of its gradient through stochastic gradient
  descent (SGD) with constant step-size. While a detailed analysis
  was previously available only for quadratic functions, we provide an
  explicit asymptotic expansion of the moments of the averaged SGD
  iterates that outlines the dependence on initial conditions, the
  effect of noise and the step-size, as well as the lack of
  convergence in the general (non-quadratic) case. To this end, we
  bring tools from Markov chain theory into the analysis of stochastic
  gradient methods. We then show that Richardson-Romberg extrapolation
  may be used to get closer to the global optimum, and we demonstrate
  empirical improvements of the new extrapolation scheme.
\end{abstract}
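
% Illustrative sketch only: the symbols $\theta^\ast$, $\bar\theta_\gamma$
% and $\Delta$ below are notation assumed here, not taken from the abstract.
To illustrate why extrapolation may help, here is a minimal sketch under
an assumed first-order expansion. Suppose the limit $\bar\theta_\gamma$ of
the averaged iterates run with constant step-size $\gamma$ deviates from
the optimum $\theta^\ast$ by a term that is linear in $\gamma$, with some
fixed vector $\Delta$ that does not depend on $\gamma$:
\[
  \bar\theta_\gamma = \theta^\ast + \gamma \Delta + O(\gamma^2).
\]
Combining two runs, one with step-size $\gamma$ and one with step-size
$2\gamma$, then cancels the first-order term:
\[
  2\bar\theta_\gamma - \bar\theta_{2\gamma}
  = 2\theta^\ast + 2\gamma\Delta - \theta^\ast - 2\gamma\Delta + O(\gamma^2)
  = \theta^\ast + O(\gamma^2).
\]
Under this assumption, the extrapolated point lies within $O(\gamma^2)$
of $\theta^\ast$ rather than $O(\gamma)$, which is the sense in which
the scheme may get closer to the global optimum.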
