1: \begin{abstract}
2: We present a strikingly simple proof that two rules are sufficient
3: to automate gradient descent: 1)~don't increase the stepsize too
4: fast and 2)~don't overstep the local curvature. No need for
5: functional values, no line search, no information about the
6: function except for the gradients. By following these rules, you
7: get a method adaptive to the local geometry, with convergence
8: guarantees depending only on the smoothness in a neighborhood of a
9: solution. Given that the problem is convex, our method
10: converges even if the global smoothness constant is infinity. As an
11: illustration, it can minimize arbitrary continuously
12: twice-differentiable convex function. We examine its performance
13: on a range of convex and nonconvex problems, including logistic
14: regression and matrix factorization.
15: \end{abstract}
16: