\begin{abstract}
In convex optimization, there is an {\em acceleration} phenomenon in which the convergence rate of certain gradient-based algorithms can be boosted.
This phenomenon appears in Nesterov's accelerated gradient descent, accelerated mirror descent, and accelerated cubic-regularized Newton's method, among others.
%We study acceleration from the vantage point of the parallel structure of discrete and continuous time optimization.
In this paper, we show that the family of higher-order gradient methods in discrete time (generalizing gradient descent) corresponds to a family of first-order rescaled gradient flows in continuous time. In contrast, the family of {\em accelerated} higher-order gradient methods (generalizing accelerated mirror descent) corresponds to a family of second-order differential equations in continuous time, each of which is the Euler-Lagrange equation of a Lagrangian functional from a corresponding family.
We also study the exponential variant of the Nesterov Lagrangian, which corresponds to a generalization of Nesterov's restart scheme and achieves a linear rate of convergence in discrete time.
Finally, we show that the family of Lagrangians is closed under time dilation (an orbit under the action of speeding up time), which demonstrates the universality of this Lagrangian view of acceleration in optimization.
%Therefore, we can interpret the family of accelerated higher-order gradient methods as the result of speeding up Nesterov's accelerated gradient descent.
\end{abstract}