
1: \begin{abstract}

2: The connection between training deep neural networks (DNNs) and optimal control theory (OCT) has attracted considerable attention as a principled tool for algorithmic design.

3: Although a few attempts have been made,

4: they have been limited to architectures where the layer propagation resembles a {Markovian dynamical system}.

5: This casts doubt on their applicability

6: to modern networks that heavily rely on {non-Markovian} dependencies between layers

7: (\eg skip connections in residual networks).

8: In this work, we propose a novel dynamic game perspective

9: by viewing each \emph{layer} as a \emph{player} in a dynamic game characterized by the DNN itself.

10: Through this lens,

11: different classes of optimizers can be seen as matching different types of {Nash equilibria}, %

12: depending on the implicit information structure {of} each (p)layer.

13: The resulting method, called Dynamic Game Theoretic Neural Optimizer (DGNOpt),

14: not only generalizes OCT-inspired optimizers to a richer class of networks;

15: it

16: also {motivates} a new training principle by solving a multi-player cooperative game.

17: DGNOpt shows convergence improvements over existing methods on

18: image classification datasets with residual and inception networks.

19: Our work marries strengths from both OCT and game theory, paving the way for new algorithmic opportunities from robust optimal control and {bandit-based optimization}.

43: \end{abstract}
