\begin{abstract}
Principal component analysis (PCA) is widely used for
dimensionality reduction, with well-documented
merits in various applications involving high-dimensional
data, including computer vision, preference measurement, and
bioinformatics. In this context, the fresh look advocated here
carries over benefits from variable selection and compressive
sampling to robustify PCA against outliers. A least-trimmed
squares estimator of a low-rank bilinear factor analysis model is
shown to be closely related to that obtained from an
$\ell_0$-(pseudo)norm-regularized criterion encouraging
\textit{sparsity} in a matrix explicitly modeling the outliers.
This connection suggests robust PCA schemes based
on convex relaxation, which lead naturally to a family of
robust estimators encompassing Huber's optimal M-class as a special case.
Outliers are identified by tuning a regularization parameter, which
amounts to controlling the sparsity of the outlier matrix along
the whole \textit{robustification} path of (group) least-absolute shrinkage
and selection operator (Lasso)
solutions. Beyond its neat ties to robust statistics, the developed
outlier-aware PCA framework is versatile enough
to accommodate novel and scalable algorithms to: i)
track the low-rank signal subspace robustly, as new data are acquired in real
time; and ii) determine principal components robustly in (possibly)
infinite-dimensional feature spaces. Synthetic and real
data tests corroborate the effectiveness of the proposed robust PCA schemes
when used to identify aberrant responses in personality assessment surveys,
as well as to unveil communities in social networks and intruders in video
surveillance data.
\end{abstract}