\begin{abstract}

Motivated by recent applications in min-max optimization, we employ tools from nonlinear control theory to analyze a class of ``historical'' gradient-based methods, for which the next step lies in the span of the previously observed gradients within a time horizon. Specifically, we leverage techniques developed by Hu and Lessard (2017) to build a frequency-domain framework that reduces the analysis of such methods to numerically solvable algebraic tasks, establishing linear convergence for a class of strongly monotone and co-coercive operators.

On the application side, we focus on the Optimistic Gradient Descent (OGD) method, which augments standard Gradient Descent with an additional past-gradient term in the optimization step. The proposed framework leads to a simple and sharp analysis of OGD---and generalizations thereof---under a broad regime of parameters. Notably, this characterization extends directly to the setting of adversarial noise in the observed value of the gradient. Moreover, our frequency-domain framework provides an exact quantitative comparison between simultaneous and alternating updates of OGD. An interesting byproduct is that OGD---and variants thereof---is an instance of PID control, arguably one of the most influential algorithms of the last century; this observation sheds more light on the stabilizing properties of ``optimism''.
\end{abstract}