\begin{abstract}
Nesterov's accelerated gradient (AG)
method for minimizing a smooth strongly convex function $f$ is
known to reduce $f(\x_k)-f(\x^*)$ by a factor
of $\eps\in(0,1)$ after $k=O(\sqrt{L/\ell}\log(1/\eps))$ iterations, where
$\ell$ and $L$ are, respectively, the strong convexity and smoothness
parameters of $f$.  Furthermore,
it is known that this is the best possible complexity in the function-gradient oracle
model of computation.  Modulo a line search, the geometric descent (GD)
method of Bubeck, Lee and Singh has the same bound for this class of functions.
The method of linear conjugate gradients (CG)
also satisfies the same
complexity bound in the special case of strongly convex quadratic functions,
but in this special case it can be faster than the AG and GD methods.

Despite similarities in the algorithms and their
asymptotic convergence rates, the conventional analysis of the
running time of CG is mostly disjoint
from that of AG and GD.  The analyses of
the AG and GD methods are also rather distinct.

Our main result is a set of analyses of the three methods that share several
common threads: all three analyses show
a relationship to a certain ``idealized algorithm'', all three
establish the convergence rate
through the use of the Bubeck-Lee-Singh geometric lemma, and all three
use the same potential,
which is computable at run time and decreases
by a factor of $1-\sqrt{\ell/L}$ or better
per iteration.

One application of these analyses is that they open the possibility of hybrid
or intermediate algorithms.  One such algorithm is proposed herein
and is shown to perform well in computational tests.
\end{abstract}