1: \begin{abstract}
2: The connection between training deep neural networks (DNNs) and optimal control theory (OCT) has attracted considerable attention as a principled tool for algorithmic design.
3: Existing attempts, however,
4: have been limited to architectures where the layer propagation resembles a {Markovian dynamical system}.
5: This casts doubt on their applicability
6: to modern networks that rely heavily on {non-Markovian} dependencies between layers
7: (\eg skip connections in residual networks).
8: In this work, we propose a novel dynamic game perspective
9: by viewing each \emph{layer} as a \emph{player} in a dynamic game characterized by the DNN itself.
10: Through this lens,
11: different classes of optimizers can be seen as matching different types of {Nash equilibria}, %
12: depending on the implicit information structure {of} each (p)layer.
13: The resulting method, called Dynamic Game Theoretic Neural Optimizer (DGNOpt),
14: not only generalizes OCT-inspired optimizers to a richer class of networks
15: but
16: also {motivates} a new training principle by solving a multi-player cooperative game.
17: DGNOpt shows convergence improvements over existing methods on
18: image classification datasets with residual and inception networks.
19: Our work marries strengths from both OCT and game theory, paving the way to new algorithmic opportunities from robust optimal control and {bandit-based optimization}.
43: \end{abstract}
44: