\begin{abstract}
Principal component analysis (PCA) is widely used for
dimensionality reduction, with well-documented
merits in various applications involving high-dimensional
data, including computer vision, preference measurement, and
bioinformatics. In this context, the fresh look advocated here
brings the benefits of variable selection and compressive
sampling to bear on robustifying PCA against outliers. A least-trimmed
squares estimator of a low-rank bilinear factor analysis model is
shown to be closely related to that obtained from an
$\ell_0$-(pseudo)norm-regularized criterion encouraging
\textit{sparsity} in a matrix explicitly modeling the outliers.
This connection suggests robust PCA schemes based
on convex relaxation, which lead naturally to a family of
robust estimators encompassing Huber's optimal M-class as a special case.
Outliers are identified by tuning a regularization parameter, which
amounts to controlling the sparsity of the outlier matrix along
the whole \textit{robustification} path of (group) least-absolute shrinkage
and selection operator (Lasso)
solutions. Beyond its neat ties to robust statistics, the developed
outlier-aware PCA framework is versatile
enough to accommodate novel and scalable algorithms to: i)
track the low-rank signal subspace robustly, as new data are acquired in real
time; and ii) determine principal components robustly in (possibly)
infinite-dimensional feature spaces. Tests on synthetic and real
data corroborate the effectiveness of the proposed robust PCA schemes
when used to identify aberrant responses in personality assessment surveys,
to unveil communities in social networks, and to detect intruders in
video surveillance data.
\end{abstract}