\begin{abstract}
We propose a novel training method based on nonlinear multilevel minimization techniques, which are commonly used for solving discretized large-scale partial differential equations.
Our multilevel training method constructs a multilevel hierarchy by reducing the number of samples.
The training of the original model is then enhanced by internally training surrogate models constructed with fewer samples.
We construct the surrogate models using a first-order consistency approach.
This gives rise to surrogate models whose gradients are stochastic estimators of the full gradient, but with reduced variance compared to standard stochastic gradient estimators.
We illustrate the convergence behavior of the proposed multilevel method on machine learning applications based on logistic regression.
A comparison with subsampled Newton and variance reduction methods demonstrates the efficiency of our multilevel method.
\end{abstract}
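
For orientation, the first-order consistency construction mentioned in the abstract can be sketched as follows. The notation is assumed here for illustration and is not taken from the source: $f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$ denotes the full objective, $S \subset \{1,\dots,n\}$ a subsample defining the coarse objective $f_S(x) = \frac{1}{|S|}\sum_{i \in S} f_i(x)$, and $x_0$ the iterate at which the surrogate is built. A standard first-order consistent surrogate $h_S$, in the style of classical nonlinear multilevel schemes, adds a linear correction term to the coarse objective:
\[
h_S(x) = f_S(x) + \bigl\langle \nabla f(x_0) - \nabla f_S(x_0),\; x - x_0 \bigr\rangle .
\]
Differentiating gives
\[
\nabla h_S(x) = \nabla f_S(x) - \nabla f_S(x_0) + \nabla f(x_0),
\qquad
\nabla h_S(x_0) = \nabla f(x_0).
\]
Under these assumptions, and if $S$ is sampled uniformly, $\mathbb{E}_S[\nabla h_S(x)] = \nabla f(x)$, so the surrogate gradient is an unbiased estimator of the full gradient. It has the same control-variate form as the estimators used in variance reduction methods, which is consistent with the reduced variance claimed for $x$ close to $x_0$. The exact construction used by the proposed method may differ from this sketch.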
