\begin{abstract}
The connection between training deep neural networks (DNNs) and optimal control theory (OCT) has attracted considerable attention as a principled tool for algorithmic design.
However, the few existing attempts
have been limited to architectures where the layer propagation resembles a {Markovian dynamical system}.
This casts doubt on their applicability
to modern networks that rely heavily on {non-Markovian} dependencies between layers
(\eg skip connections in residual networks).
In this work, we propose a novel dynamic game perspective
by viewing each \emph{layer} as a \emph{player} in a dynamic game characterized by the DNN itself.
Through this lens,
different classes of optimizers can be seen as matching different types of {Nash equilibria},
depending on the implicit information structure {of} each (p)layer.
The resulting method, called Dynamic Game Theoretic Neural Optimizer (DGNOpt),
not only generalizes OCT-inspired optimizers to a richer class of networks
but also {motivates} a new training principle based on solving a multi-player cooperative game.
DGNOpt shows convergence improvements over existing methods on
image classification datasets with residual and inception networks.
Our work marries strengths from both OCT and game theory, paving the way for new algorithmic opportunities from robust optimal control and {bandit-based optimization}.
\end{abstract}