\begin{abstract}
Machine learning (ML) has become a promising tool for improving the
quality of atomistic simulations. Using formaldehyde as a benchmark
system for intramolecular interactions, a comparative assessment of ML
models based on state-of-the-art variants of deep neural networks
(NN), reproducing kernel Hilbert space (RKHS+F), and kernel ridge
regression (KRR) is presented. Learning curves for energies and atomic
forces indicate rapid convergence towards excellent predictions for
B3LYP, MP2, and CCSD(T)-F12 reference results with modestly sized
training sets (in the hundreds). Typically, learning curve offsets
decay as one goes from NN (PhysNet) to RKHS+F to KRR (FCHL).
Conversely, the predictive power for extrapolation of energies towards
new geometries increases in the same order, with RKHS+F and FCHL
performing almost equally. For harmonic vibrational frequencies, the
picture is less clear: PhysNet and FCHL yield flat learning curves at
$\sim 1$ and $\sim 0.2$ cm$^{-1}$, respectively, regardless of the
reference method, while RKHS+F models level off for B3LYP and exhibit
continued improvements for MP2 and CCSD(T)-F12. Finite-temperature
molecular dynamics (MD) simulations with the same initial conditions
yield indistinguishable infrared spectra that agree well with
experiment, except for the high-frequency modes involving hydrogen
stretch motion, which is a known limitation of MD for vibrational
spectroscopy. For sufficiently large training set sizes, all three
models can detect insufficient convergence (``noise'') of the
reference electronic structure calculations in that the learning
curves level off. Transfer learning (TL) from B3LYP to CCSD(T)-F12
with PhysNet indicates that additional improvements in data efficiency
can be achieved.
\end{abstract}