
\begin{abstract}

In this paper we study stochastic quasi-Newton methods for nonconvex stochastic optimization, where we assume that noisy information about the gradients of the objective function is available via a stochastic first-order oracle ($\SFO$). We propose a general framework for such methods, for which we prove almost sure convergence to stationary points and analyze the worst-case iteration complexity. When a randomly chosen iterate is returned as the output of such an algorithm, we prove that, in the worst case, the $\SFO$-calls complexity is $O(\epsilon^{-2})$ to ensure that the expectation of the squared norm of the gradient is smaller than the given accuracy tolerance $\epsilon$. We also propose a specific algorithm, namely a stochastic damped L-BFGS (SdLBFGS) method, that falls under the proposed framework. Moreover, we incorporate the SVRG variance reduction technique into the proposed SdLBFGS method and analyze its $\SFO$-calls complexity. Numerical results on a nonconvex binary classification problem using SVM and a multiclass classification problem using neural networks are reported.

\vspace{0.8cm}

\noindent {\bf Keywords:} {Nonconvex Stochastic Optimization, Stochastic Approximation, Quasi-Newton Method, Damped L-BFGS Method, Variance Reduction}

\vspace{0.5cm}

\noindent {\bf Mathematics Subject Classification 2010:} 90C15; 90C30; 62L20; 90C60

\end{abstract}
