\begin{abstract}
Deep learning has shown high performance in a wide range of tasks, from visual recognition to natural language processing,
which indicates its superior flexibility and adaptivity.
To understand this phenomenon theoretically,
we develop a new approximation and estimation error analysis of
deep learning with the ReLU activation for functions
in a Besov space and its variant with mixed smoothness.
The Besov space is a considerably general function space that includes the \Holder space and the Sobolev space,
and in particular it can capture spatial inhomogeneity of smoothness.
Through the analysis in the Besov space,
we show that deep learning can achieve the minimax optimal rate and
outperform
any non-adaptive (linear) estimator such as kernel ridge regression.
This shows that deep learning adapts better to
the spatial inhomogeneity of the target function than
other estimators such as linear ones.
In addition,
we show that deep learning can avoid
the curse of dimensionality
if the target function is in a {\it mixed smooth} Besov space.
We also show that the dependence of the convergence rate on the dimensionality is tight,
owing to its minimax optimality.
These results support the high adaptivity of deep learning and its superior ability as a feature extractor.
%from a theoretical view point.
%Our theoretical results support importance of deep neural network's ability as a feature extractor.
\end{abstract}