\begin{abstract}%
	Training deep neural networks (DNNs) can be difficult due to the occurrence of vanishing/exploding gradients during weight optimization.
	To avoid this problem, we propose a class of DNNs stemming from the time discretization of Hamiltonian systems.
	The time-invariant version of the corresponding Hamiltonian models enjoys marginal stability, a property that, as shown in previous works for specific DNN architectures, can mitigate convergence to zero or divergence of gradients.
	In the present paper, we formally study this feature by deriving and analysing the backward gradient dynamics in continuous time.
	The proposed Hamiltonian framework, besides encompassing existing networks inspired by marginally stable ODEs, allows one to derive new and more expressive architectures.
	The good performance of the novel DNNs is demonstrated on benchmark classification problems, including digit recognition using the MNIST dataset.
	\\
	\textbf{Keywords: } Deep Neural Networks, Dynamical Systems, Hamiltonian Systems, Gradient Dynamics.%
\end{abstract}