
\begin{abstract}

We propose a stochastic second-order method within a non-monotone trust-region framework for solving unconstrained, nonlinear, non-convex optimization problems arising in the training of deep neural networks. {\color{blue} We apply subsampling strategies that yield noisy approximations of the finite-sum objective function and its gradient. Our approach also uses additional sampling to control the resulting approximation error, i.e., to construct an adaptive sample size strategy. Depending on the estimated progress of the algorithm, this can yield sample size scenarios ranging from mini-batch to full-sample functions. We provide a convergence analysis for all possible scenarios and show that the proposed method achieves almost sure convergence under standard assumptions for the trust-region framework.}

Numerical experiments on the training of deep residual networks for image classification and regression tasks show that the proposed algorithm outperforms its state-of-the-art counterpart on the considered problems.

%the first-order model STORM, a well-established adaptive second-order stochastic TR algorithm.

\end{abstract}
