\begin{abstract}
We consider the computation, by simulation and neural net regression, of conditional expectations, or more general elicitable statistics, of functionals of
processes $(\theX,\theI)$.
Here an exogenous component $\theI$ (Markov by itself) is time-consuming to simulate, while the endogenous component $\theX$ (jointly Markov with $\theI$) is
quick to simulate given $\theI$, but is responsible for most of the variance of the simulated payoff.
To address the resulting variance issue,
we introduce a conditionally independent, hierarchical simulation scheme, where several paths of $\theX$ are simulated for each simulated path of $\theI$.
We analyze the statistical convergence of the regression learning scheme based on such block-dependent data.
We derive heuristics for the number of paths of $\theI$ and, for each of them, the number of paths of $\theX$ that should be simulated.
The resulting algorithm is implemented on a graphics processing unit (GPU), combining Python/CUDA with learning in PyTorch.
A credit valuation adjustment (CVA) case study with a nested Monte Carlo benchmark shows that the \oversimulation technique is key to the success of the learning approach.
\end{abstract}