\begin{abstract}

In this work, we propose a stochastic second-order method within a non-monotone trust-region framework for solving unconstrained, nonlinear, non-convex optimization problems arising in the training of deep neural networks. We apply subsampling strategies that yield noisy approximations of the finite-sum objective function and its gradient. To control the resulting approximation error, our approach uses additional sampling to construct an adaptive sample size strategy. Depending on the estimated progress of the algorithm, the sample size can range from a mini-batch to the full sample. We provide a convergence analysis covering all possible scenarios and show that the proposed method achieves almost sure convergence under standard assumptions for the trust-region framework. Our numerical results show that the proposed algorithm outperforms its state-of-the-art counterpart in the training of deep neural networks for image classification and regression tasks.

%the first-order model STORM, a well-established adaptive second-order stochastic TR algorithm.

\end{abstract}
