\begin{abstract}
Deep learning has shown high performance in a wide range of tasks, from visual recognition to natural language processing,
which indicates its superior flexibility and adaptivity.
To understand this phenomenon theoretically,
we develop a new approximation and estimation error analysis of
deep learning with the ReLU activation for functions
in a Besov space and its variant with mixed smoothness.
The Besov space is a considerably general function space that includes the \Holder space and the Sobolev space,
and, in particular, it can capture spatial inhomogeneity of smoothness.
Through the analysis in the Besov space,
it is shown that deep learning can achieve the minimax optimal rate and
outperform
any non-adaptive (linear) estimator such as kernel ridge regression.
This shows that deep learning adapts better to
the spatial inhomogeneity of the target function than
linear estimators do.
In addition,
it is shown that deep learning can avoid
the curse of dimensionality
if the target function is in a {\it mixed smooth} Besov space.
We also show that the dependence of the convergence rate on the dimensionality is tight
due to its minimax optimality.
These results support the high adaptivity of deep learning and its superior ability as a feature extractor.
%from a theoretical view point.
%Our theoretical results support importance of deep neural network's ability as a feature extractor.
\end{abstract}