1: \begin{abstract}
2: \noindent
3: Row-sparse principal component analysis (rsPCA), also known as principal component analysis (PCA) with global support, is the problem of finding the $r$ leading principal components such that all of them are linear combinations of the same subset of $k$ variables. rsPCA is a popular dimension-reduction tool in statistics that enhances interpretability compared to regular PCA. Popular methods for solving rsPCA in the literature are either greedy heuristics (in the special case $r = 1$), for which guarantees on the quality of the solution found can be verified under restrictive statistical models, or algorithms that are guaranteed to converge to a stationary point of some regularized reformulation of rsPCA. There are no known good heuristics when $r > 1$ and, more importantly, none of the existing computational methods can efficiently verify the quality of its solutions by comparing the objective values of feasible solutions with dual bounds, especially in a statistical-model-free setting.
4:
5: We propose (a) a convex integer programming relaxation of rsPCA that gives upper (dual) bounds for rsPCA, and (b) a new local search algorithm for finding primal feasible solutions for rsPCA in the general case where $r > 1$. We also show that, in the worst case, the dual bounds provided by the convex integer program are within an affine function of the globally optimal value.
6: We report numerical results that demonstrate the advantages of our method.
7: \end{abstract}
8: