abstract:de53cf9c0b2e2e20.tex

1: \begin{abstract}

2: We study high-dimensional estimators with the trimmed $\ell_1$ penalty,

3: which leaves the $h$ largest parameter entries penalty-free.

4: While optimization techniques for this nonconvex penalty have been studied, its statistical properties have not yet been analyzed.

5: We present the first statistical analyses for $M$-estimation,

6: and characterize support recovery, $\ell_\infty$ and $\ell_2$ error of the trimmed $\ell_1$ estimates as a function of the trimming parameter $h$.

7: Our results show different regimes based on how $h$ compares to the true support size.

8: Our second contribution is a new algorithm for the trimmed regularization problem,

9: which has the same theoretical convergence rate as difference-of-convex (DC) algorithms,

10: but in practice is faster and finds lower objective values. Empirical evaluation of $\ell_1$ trimming for sparse linear regression and graphical model estimation indicates that trimmed $\ell_1$ can outperform vanilla $\ell_1$ and nonconvex alternatives.

11: Our last contribution is to show that the trimmed penalty is beneficial beyond $M$-estimation, and yields promising results for two deep learning tasks: input structure recovery and network sparsification.

12: \end{abstract}
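
% Illustrative sketch only: the notation below is assumed, not taken from the abstract.
For concreteness, the trimmed $\ell_1$ penalty described in the abstract can be sketched as follows. The symbols $\theta \in \mathbb{R}^p$ and $\mathcal{R}_h$, the ordering $|\theta_{(1)}| \ge |\theta_{(2)}| \ge \dots \ge |\theta_{(p)}|$, and the reading of ``largest'' as largest in magnitude are assumptions made here for illustration. Leaving the $h$ largest entries penalty-free then amounts to summing only the $p-h$ smallest magnitudes:
\[
  \mathcal{R}_h(\theta) \;=\; \sum_{j=h+1}^{p} \bigl|\theta_{(j)}\bigr|, \qquad 0 \le h \le p .
\]
Under this reading, $h = 0$ recovers the usual $\ell_1$ norm $\|\theta\|_1$, while $h = p$ removes the penalty entirely.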

13: