\begin{abstract}
This study focuses on a Wasserstein-type gradient flow that represents the optimization process of a continuous model of a deep neural network (DNN).
First, we establish the existence of a minimizer of an average loss of the model under $L^2$-regularization.
Subsequently, we show the existence of a curve of maximal slope for the loss.
Our main result is the convergence of the flow to a critical point of the loss as time goes to infinity.
A key step in the proof is establishing the \L{}ojasiewicz--Simon gradient inequality for the loss.
We derive this inequality by assuming that the neural network and the loss function are analytic.
Our proofs offer a new approach to analyzing the asymptotic behavior of Wasserstein-type gradient flows for nonconvex functionals.
\end{abstract}