adb52f0922ef4cc7.tex
1: \begin{abstract}
2: %Escaping from saddle points and finding local minimum is a central problem in nonconvex optimization. In this paper, we propose a novel algorithm named stochastic adaptive recursive gradient descent (SATURN) for finding local minima.  We show that SATURN can find an $O(\epsilon, \epsilon_{H})$-approximate local minima within $\tilde O(\epsilon^{-3} + \epsilon_{H}^{-6})$ stochastic gradient evaluations. Our algorithm is based upon a stochastic recursive gradient and a new adaptive learning rate schedule. To the best of our knowledge, SATURN is the first pure stochastic gradient-based local minima finding algorithm with $\tilde O(\epsilon^{-3}+\epsilon_{H}^{-6})$ gradient complexity, without accessing a negative-curvature search subroutine.
3: 
4: %\dongruo{remember to change the name in abstract}
5: 
6: Escaping from saddle points and finding local minima is a central problem in nonconvex optimization.
7: Perturbed gradient methods are perhaps the simplest approach to this problem. However, to find $(\epsilon, \sqrt{\epsilon})$-approximate local minima, the best existing stochastic gradient complexity for this type of algorithm is $\tilde O(\epsilon^{-3.5})$, which is not optimal.
8: In this paper, we propose \texttt{LENA} (\textbf{L}ast st\textbf{E}p shri\textbf{N}k\textbf{A}ge), a faster perturbed stochastic gradient framework for finding local minima.  We show that $\algname$ with stochastic gradient estimators such as SARAH/SPIDER and STORM can find $(\epsilon, \epsilon_{H})$-approximate local minima within $\tilde O(\epsilon^{-3} + \epsilon_{H}^{-6})$ stochastic gradient evaluations (or $\tilde O(\epsilon^{-3})$ when $\epsilon_H = \sqrt{\epsilon}$). 
9: The core idea of our framework is a step-size shrinkage scheme that controls the average movement of the iterates, which leads to faster convergence to local minima. %This is new and of independent interest. 
10: 
11: 
12: 
13: 
14: \end{abstract}
15: