\begin{abstract}

The note considers \emph{normalized gradient descent} (NGD), a natural modification of classical gradient descent (GD) in optimization problems. A serious shortcoming of GD in non-convex problems is that GD may take arbitrarily long to escape from the neighborhood of a saddle point. This issue can make the convergence of GD arbitrarily slow, particularly in high-dimensional non-convex problems where the relative number of saddle points is often large.

The note focuses on continuous-time descent. It is shown that, in contrast to standard GD, NGD escapes saddle points ``quickly.'' In particular, it is shown that (i) NGD ``almost never'' converges to saddle points and (ii) the time required for NGD to escape from a ball of radius $r$ about a saddle point $x^*$ is at most $5\sqrt{\kappa}r$, where $\kappa$ is the condition number of the Hessian of $f$ at $x^*$. As an application of this result, a global convergence-time bound is established for NGD under mild assumptions.

\end{abstract}
