\begin{abstract}
We propose a new algorithm for finite sum optimization
     which we call the curvature-aided incremental aggregated
     gradient ({\sf CIAG}) method.
     Motivated by the problem of training a classifier for a $d$-dimensional problem,
     where the number of training samples is $m$ and $m \gg d \gg 1$,
     the {\sf CIAG} method seeks to accelerate
     incremental aggregated gradient ({\sf IAG}) methods with the aid of
     curvature (Hessian) information, while avoiding the
     evaluation of matrix inverses required by the
     incremental Newton ({\sf IN}) method.
     Specifically, our idea is to exploit
     the incrementally aggregated Hessian matrix to track the
     full gradient vector at every incremental step, thereby
     achieving an improved linear convergence rate over the
     state-of-the-art {\sf IAG} methods.
     For strongly convex problems,
     attaining the fast linear convergence rate requires the objective function
     to be close to quadratic, or the initial point to be close to the optimal solution.
     Importantly, we show that running \emph{one} iteration of
     the {\sf CIAG} method yields the same improvement in the optimality gap
     as running one iteration of the \emph{full gradient} method,
     while the per-iteration complexity is ${\cal O}(d^2)$ for {\sf CIAG} and ${\cal O}(md)$
     for the full gradient method.
     Overall, the {\sf CIAG} method strikes a balance
     between the high computational complexity of
     incremental Newton-type methods and the slow convergence of the {\sf IAG}
     method.
     Our numerical results support the theoretical findings and
     show that the {\sf CIAG} method
     often converges in far fewer iterations than {\sf IAG},
     and requires a much shorter running time than {\sf IN} when the problem
     dimension is high.
\end{abstract}

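The gradient-tracking idea summarized in the abstract can be sketched as follows. The notation is an assumption made for illustration and is not fixed by the abstract: the objective is written as $F(\theta) = \sum_{i=1}^m f_i(\theta)$, $\theta^k$ is the current iterate, $\tau_i^k \le k$ is the iteration at which component $i$ was last visited, and $\gamma > 0$ is a step size. A first-order Taylor expansion of each component gradient around its stale point gives
\[
  \nabla f_i(\theta^k) \approx \nabla f_i(\theta^{\tau_i^k})
    + \nabla^2 f_i(\theta^{\tau_i^k}) \, (\theta^k - \theta^{\tau_i^k}) .
\]
Summing over $i$ suggests the surrogate $\nabla F(\theta^k) \approx b^k + H^k \theta^k$, where
\[
  H^k = \sum_{i=1}^m \nabla^2 f_i(\theta^{\tau_i^k}), \qquad
  b^k = \sum_{i=1}^m \Big( \nabla f_i(\theta^{\tau_i^k})
    - \nabla^2 f_i(\theta^{\tau_i^k}) \, \theta^{\tau_i^k} \Big) ,
\]
so that one plausible form of the update is
\[
  \theta^{k+1} = \theta^k - \gamma \, \big( b^k + H^k \theta^k \big) .
\]
Under these assumptions, refreshing a single component changes $H^k$ and $b^k$ by one term each, so the work per iteration is dominated by $d \times d$ matrix operations (provided a component Hessian can itself be formed in ${\cal O}(d^2)$ time), which is consistent with the ${\cal O}(d^2)$ figure quoted above. The expansion is exact when every $f_i$ is quadratic, which agrees with the stated requirement that the objective be close to quadratic.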