
\begin{abstract}
Training neural ODEs on large datasets has not been tractable, because the
adaptive numerical ODE solver must be allowed to refine its step size to very
small values. In practice this leads to dynamics equivalent to many hundreds or
even thousands of layers. In this paper, we overcome this apparent difficulty
by introducing a theoretically grounded combination of optimal transport and
stability regularizations that encourage neural ODEs to prefer simpler dynamics
among all the dynamics that solve a problem well. Simpler dynamics lead to
faster convergence and to fewer discretizations of the solver, considerably
decreasing wall-clock time without loss in performance. Our approach allows us
to train neural ODE-based generative models to the same performance as the
unregularized dynamics, with significant reductions in training time. This
brings neural ODEs closer to practical relevance in large-scale applications.
\end{abstract}
