1: \begin{abstract}
2: We derive estimation error rates for estimators built by aggregating regularized median-of-means tests, following a construction of Le Cam. The results hold with exponentially large probability,
3: % -- as in the gaussian framework with independent noise-- 
4:  under only weak moment assumptions on the data.
5: %  and without assuming independence between noise and design.
6: Any norm may be used for regularization; when the norm is sparsity-inducing, we recover sparse rates of convergence.
7: %
8: The procedure is robust: a large part of the data may be corrupted, and these outliers need not have anything to do with the oracle we want to reconstruct. Our general risk bound is of order
9: \begin{equation*}
10: \max\left(\mbox{minimax rate in the i.i.d. setup}, \frac{\mbox{number of outliers}}{\mbox{number of observations}}\right)  \enspace.
11: \end{equation*} In particular, the number of outliers may be as large as \textit{(number of observations) $\times$ (minimax rate)} without affecting this rate. The remaining data need not be identically distributed; they are only required to have equivalent $L^1$ and $L^2$ moments.
12: %
13: %
14: For example, the minimax rate $s \log(ed/s)/N$ of recovery of an $s$-sparse vector in $\R^d$ is achieved with exponentially large probability by a median-of-means version of the LASSO when the noise has $q_0$ moments for some $q_0>2$, the entries of the design matrix have $C_0\log(ed)$ moments, and the dataset is corrupted by up to $C_1 s \log(ed/s)$ outliers.
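To see how this example fits the general bound, one may plug the sparse rate into the maximum displayed earlier. Writing $|\mathcal{O}|$ for the number of outliers (a notation introduced here only for illustration), and ignoring the absolute constants hidden in ``of order'', the bound satisfies
\begin{equation*}
\max\left(\frac{s\log(ed/s)}{N}, \frac{|\mathcal{O}|}{N}\right) \leq \max(1, C_1)\,\frac{s\log(ed/s)}{N} \quad \mbox{whenever } |\mathcal{O}|\leq C_1 s\log(ed/s) \enspace,
\end{equation*}
so in this regime the outliers affect the rate only through the constant.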
15: % and the result holds . 
16: %This result holds with exponentially large probability as if the noise and the design were i.i.d. Gaussian random variables. 
17: \end{abstract}
18: