1: \begin{abstract}
2: \noindent
3: Modern Monte Carlo-type approaches
4: to dynamic decision problems
5: face the classical bias-variance trade-off.
6: Deep neural networks can
7: overlearn the data and construct
8: feedback actions which are non-adapted to
9: the information flow and hence,
10: become susceptible to generalization error.
11: We prove
12: asymptotic overlearning for fixed training sets, but also
13: provide a non-asymptotic upper bound on overperformance
14: based on the Rademacher complexity
15: demonstrating the convergence of these algorithms
16: for sufficiently large training sets.
17: Numerically studied
18: stylized examples illustrate these possibilities, the dependence on the
19: dimension and the effectiveness of this approach.
20: \end{abstract}
21: