\begin{abstract}
We present a new multilevel minimization framework for the training of deep residual networks (ResNets), which has the potential to significantly reduce training time and effort.
Our framework is based on the dynamical systems viewpoint, which formulates a ResNet as the discretization of an initial value problem.
The training process is then formulated as a time-dependent optimal control problem, which we discretize using different time-discretization parameters,
thereby generating a multilevel hierarchy of auxiliary networks with different resolutions.
The training of the original ResNet is then enhanced by training the auxiliary networks with reduced resolutions.
By design, our framework is independent of the training strategy chosen on each level of the multilevel hierarchy.
By means of numerical examples, we analyze the convergence behavior of the proposed method and demonstrate its robustness.
For our examples, we employ a multilevel gradient-based method.
Comparisons with standard single-level methods show a speedup of more than a factor of three while achieving the same validation accuracy.
\end{abstract}