1: \begin{abstract}
2:     We give a general result concerning the rates of convergence of
3:     penalized empirical risk minimizers (PERM) in the regression
4:     model. Then, we consider the problem of agnostic learning of the
5:     regression, and give in this context an oracle inequality and a
6:     lower bound for PERM over a finite class. These results hold for a
7:     general multivariate random design, the only assumption being the
8:     compactness of the support of its law (allowing discrete
9:     distributions for instance). Then, using these results, we
10:     construct adaptive estimators. We consider as examples adaptive
11:     estimation over anisotropic Besov spaces or reproducing kernel
12:     Hilbert spaces. Finally, we provide empirical evidence that
13:     aggregation leads to more stable estimators than more standard
14:     cross-validation or generalized cross-validation methods for the
15:     selection of the smoothing parameter, when the number of
16:     observations is small.
17:     % estimators which are Our aggregation
18:     % approach is motivated by a lower bound for PERM procedures over
19:     % a finite set of weak estimators, which proves that PERM
20:     % procedures are suboptimal compared to some exponential weighted
21:     % averaged schemes.
22:     % We propose an adaptive estimator of the multivariate regression
23:     % function $f_0$ from i.i.d. observations. Without assumption on
24:     % the law $P_X$ of the covariates, besides almost sure
25:     % boundedness, we prove that the standard rate $n^{-s / (2s + 1)}$
26:     % can be achieved by an adaptive estimator, where $n$ denotes the
27:     % sample size and $s$ the smoothness of $f_0$ measured in some
28:     % sense, including Besov smoothness. The assumption on the noise
29:     % is fairly general.
30:   \end{abstract}
31: