1: \begin{abstract}
2: \emph{Anderson acceleration} (or Anderson mixing) is an efficient acceleration method for fixed point iterations $x_{t+1}=G(x_t)$, e.g., gradient descent can be viewed as iteratively applying the operation $G(x) \triangleq x-\alpha\nabla f(x)$.
3: It is known that Anderson acceleration is quite efficient in practice and can be viewed as an extension of Krylov subspace methods for nonlinear problems.
4: In this paper, we show that Anderson acceleration with Chebyshev polynomial can achieve the optimal convergence rate $O(\sqrt{\kappa}\ln\frac{1}{\epsilon})$, which improves the previous result $O(\kappa\ln\frac{1}{\epsilon})$ provided by \citep{toth2015convergence} for quadratic functions.
5: Moreover, we provide a convergence analysis for minimizing general nonlinear problems.
6: Besides, if the hyperparameters (e.g., the Lipschitz smooth parameter $L$) are not available, we propose a \emph{guessing algorithm} for guessing them dynamically and also prove a similar convergence rate.
7: Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev acceleration method converges significantly faster than other algorithms, e.g., vanilla gradient descent (GD), Nesterov's Accelerated GD.
8: Also, these algorithms combined with the proposed guessing algorithm (guessing the hyperparameters dynamically) achieve much better performance.
9: \end{abstract}
10: