\begin{abstract}
This paper is concerned with screening features in ultrahigh dimensional data analysis, which
has become increasingly important in diverse scientific fields.
We develop a sure independence screening procedure based on the
distance correlation (DC-SIS, for short).
%The distance correlation was proposed by \cite{Szekely:Rizzo:Bakirov:2007}
%for measuring dependence between two random vectors.
The DC-SIS can be implemented as easily as the
sure independence screening procedure based on the Pearson correlation (SIS, for short)
proposed by \cite{Fan:Lv:2008}.
However, the DC-SIS can significantly improve on the SIS. \cite{Fan:Lv:2008} established the sure screening
property for the SIS under linear models, whereas the sure screening property holds
for the DC-SIS under more general settings that include linear models.
Furthermore, the implementation of the DC-SIS does not require model specification (e.g., linear model or
generalized linear model) for responses or predictors. This is a very appealing
property in ultrahigh dimensional data analysis. Moreover, the DC-SIS can be used directly to screen
grouped predictor variables and to handle multivariate response variables. We establish the sure screening property
for the DC-SIS, and conduct simulations to examine its finite sample performance.
Numerical comparisons indicate that the DC-SIS performs much better than the SIS in various
models. We also illustrate the DC-SIS through a real data example.
\end{abstract}
