\begin{abstract}
We propose a new algorithm for finite-sum optimization,
     which we call the curvature-aided incremental aggregated
     gradient ({\sf CIAG}) method.
     Motivated by the problem of training a classifier on $m$ training samples
     in $d$ dimensions, where $m \gg d \gg 1$,
     the {\sf CIAG} method seeks to accelerate
     incremental aggregated gradient ({\sf IAG}) methods with the aid of
     curvature (Hessian) information, while avoiding the
     evaluation of matrix inverses required by the
     incremental Newton ({\sf IN}) method.
     Specifically, our idea is to exploit
     the incrementally aggregated Hessian matrix to track the
     full gradient vector at every incremental step, thereby
     achieving a faster linear convergence rate than
     state-of-the-art {\sf IAG} methods.
     For strongly convex problems,
     the fast linear convergence rate requires the objective function
     to be close to quadratic, or the initial point to be close to the optimal solution.
     Importantly, we show that running \emph{one} iteration of
     the {\sf CIAG} method yields the same improvement in the optimality gap
     as running one iteration of the \emph{full gradient} method,
     while the per-iteration complexity is ${\cal O}(d^2)$ for {\sf CIAG} and ${\cal O}(md)$
     for the full gradient method.
     Overall, the {\sf CIAG} method strikes a balance
     between the computationally expensive
     incremental Newton-type methods and the slow {\sf IAG}
     method.
     Our numerical results support the theoretical findings and
     show that the {\sf CIAG} method
     often converges in far fewer iterations than {\sf IAG}
     and requires a much shorter running time than {\sf IN} when the problem
     dimension is high.
\end{abstract}
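
To make the idea of following the full gradient with aggregated Hessians concrete, here is a minimal sketch under assumed notation, none of which is fixed by the abstract: the objective is $F(\theta) = \sum_{i=1}^m f_i(\theta)$ with $\theta$ a $d$-dimensional vector, $\gamma > 0$ is a step size, and $\tau_i^k \le k$ denotes the most recent iteration at which component $i$ was visited. A first-order Taylor expansion of each component gradient around its last visited point suggests
\[
\nabla f_i(\theta^k) \approx \nabla f_i(\theta^{\tau_i^k})
  + \nabla^2 f_i(\theta^{\tau_i^k}) \, (\theta^k - \theta^{\tau_i^k}).
\]
Summing over $i$ gives a surrogate for the full gradient,
\[
\nabla F(\theta^k) \approx b^k + H^k \theta^k, \qquad
b^k = \sum_{i=1}^m \left[ \nabla f_i(\theta^{\tau_i^k})
  - \nabla^2 f_i(\theta^{\tau_i^k}) \, \theta^{\tau_i^k} \right], \qquad
H^k = \sum_{i=1}^m \nabla^2 f_i(\theta^{\tau_i^k}),
\]
so that a gradient-type step would read
\[
\theta^{k+1} = \theta^k - \gamma \left( b^k + H^k \theta^k \right).
\]
In this sketch only one summand of $b^k$ and of $H^k$ changes per iteration. Assuming the gradient and Hessian of a single component can be evaluated in ${\cal O}(d^2)$ operations, the cost per step is dominated by the matrix-vector product $H^k \theta^k$, that is ${\cal O}(d^2)$, and no matrix inverse is formed. The surrogate is exact when every $f_i$ is quadratic, which is consistent with the near-quadratic condition mentioned in the abstract.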
