\begin{abstract}
We develop a theory that elucidates why deep neural networks (DNNs) perform better than other methods.
For the nonparametric regression problem, it is well known that many standard methods attain the minimax optimal rate of estimation error for smooth functions; thus, it is not straightforward to identify the theoretical advantages of DNNs.
This study fills this gap by considering the estimation of a class of non-smooth functions that have singularities on smooth curves.
Our findings are as follows: (i) We derive the generalization error of a DNN estimator and prove that its convergence rate is almost optimal. (ii) We reveal that a certain class of common models is sub-optimal, including linear estimators and other harmonic analysis methods such as wavelets and curvelets.
This advantage of DNNs comes from the fact that the shape of the singularity can be handled successfully by their multi-layered structure.
\end{abstract}