
1: \begin{abstract}

2: Sparse principal component analysis (PCA) is a popular tool for dimension reduction of high-dimensional data. Despite its popularity, a Bayesian sparse PCA method that is both theoretically justified and computationally scalable is still lacking.

3: A major challenge is choosing a suitable prior for the loadings matrix, as principal components are mutually orthogonal.

4: We propose a spike and slab prior that meets this orthogonality constraint and show that the posterior enjoys both theoretical and computational advantages.

5: We develop two computational algorithms, PX-CAVI and PX-EM. Both use parameter expansion to handle the orthogonality constraint and to accelerate convergence.

6: We find that the PX-CAVI algorithm empirically outperforms the PX-EM algorithm and two penalty methods for sparse PCA.

7: The PX-CAVI algorithm is then applied to study a lung cancer gene expression dataset.

8: The $\mathsf{R}$ package $\mathsf{VBsparsePCA}$, which implements the algorithm, is available on the Comprehensive R Archive Network (CRAN).

9: \end{abstract}

10: