1: \begin{abstract}
2:  We consider the high-dimensional discriminant analysis problem. For
3:   this problem, different methods have been proposed and justified by
4:   establishing exact convergence rates for the classification risk, as
5:   well as $\ell_2$ convergence results for the discriminative
6:   rule. However, a sharp theoretical analysis of the variable selection
7:   performance of these procedures has not been established, even
8:   though model interpretation is of fundamental importance in
9:   scientific data analysis.  This paper bridges the gap by providing
10:   sharp sufficient conditions for consistent variable selection using
11:   the sparse discriminant analysis \citep{mai2012}. Through careful
12:   analysis, we establish rates of convergence that are significantly
13:   faster than the best known results and admit an optimal scaling of
14:   the sample size $n$, dimensionality $p$, and sparsity level $s$ in
15:   the high-dimensional setting.  Sufficient conditions are
16:   complemented by the necessary information-theoretic limits on the
17:   variable selection problem in the context of high-dimensional
18:   discriminant analysis. Exploiting a numerical equivalence result,
19:   our method also establishes optimal results for the ROAD estimator
20:   \citep{fan2010road} and the sparse optimal scaling estimator
21:   \citep{clemmensen2011sparse}.  Furthermore, we analyze an exhaustive
22:   search procedure, whose performance serves as a benchmark, and show
23:   that it is variable selection consistent under weaker conditions.
24:   Extensive simulations demonstrating the sharpness of the bounds are
25:   also provided.
26: \end{abstract}
27: