eb2782c478ea0561.tex
1: \begin{abstract}
2:     We present a strikingly simple proof that two rules are sufficient
3:     to automate gradient descent: 1)~don't increase the stepsize too
4:     fast and 2)~don't overstep the local curvature. No need for
5:     functional values, no line search, no information about the
6:     function except for the gradients. By following these rules, you
7:     get a method adaptive to the local geometry, with convergence
8:     guarantees depending only on the smoothness in a neighborhood of a
9:     solution. Given that the problem is convex, our method
10:     converges even if the global smoothness constant is infinity. As an
11:     illustration, it can minimize arbitrary continuously
12:     twice-differentiable convex function. We examine its performance
13:     on a range of convex and nonconvex problems, including logistic
14:     regression and matrix factorization.
15: \end{abstract}
16: