abstract:8ea6d6aa99ea78fe.tex

1: \begin{abstract}

2:   \emph{Anderson acceleration} (or Anderson mixing) is an efficient acceleration method for fixed-point iterations $x_{t+1}=G(x_t)$; for example, gradient descent can be viewed as iteratively applying the operation $G(x) \triangleq x-\alpha\nabla f(x)$.
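
  As a small illustration of this fixed-point view, consider a quadratic objective; the symbols $A$, $b$, and $x^\star$ are introduced here only for the example and are assumptions, not notation fixed by the text above. If $f(x)=\frac{1}{2}x^\top A x-b^\top x$ with $A$ symmetric positive definite, then $\nabla f(x)=Ax-b$ and
  \[
    G(x) = x-\alpha(Ax-b) = (I-\alpha A)\,x+\alpha b ,
  \]
  so for any $\alpha\neq 0$ the fixed point $G(x^\star)=x^\star$ satisfies $Ax^\star=b$, i.e., $x^\star=A^{-1}b$ is exactly the minimizer of $f$.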

3:   Anderson acceleration is known to be quite efficient in practice and can be viewed as an extension of Krylov subspace methods to nonlinear problems.

4:   In this paper, we show that, for quadratic functions, Anderson acceleration with Chebyshev polynomials can achieve the optimal convergence rate $O(\sqrt{\kappa}\ln\frac{1}{\epsilon})$, which improves on the previous result $O(\kappa\ln\frac{1}{\epsilon})$ of \citet{toth2015convergence}.

5:   Moreover, we provide a convergence analysis for general nonlinear minimization problems.

6:   In addition, for the case where the hyperparameters (e.g., the Lipschitz smoothness parameter $L$) are not available, we propose a \emph{guessing algorithm} that guesses them dynamically, and we prove a similar convergence rate.

7:   Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev acceleration method converges significantly faster than other algorithms, such as vanilla gradient descent (GD) and Nesterov's accelerated GD.

8:   Moreover, combining these algorithms with the proposed guessing algorithm (which guesses the hyperparameters dynamically) yields much better performance.

9: \end{abstract}

10: