abstract:99ff050e10915ec8.tex

1: \begin{abstract}

2: We consider the projected gradient algorithm for the nonconvex best subset selection

3: problem that minimizes a given empirical loss function under an $\ell_0$-norm

4: constraint.

5: Through decomposing the feasible set of the given sparsity constraint

6: as a finite union of linear subspaces, we present two acceleration

7: schemes with global convergence guarantees, one by same-space

8: extrapolation and the other by subspace identification.

9: The former fully utilizes the problem structure to greatly accelerate

10: the optimization speed with only negligible additional cost.

11: The latter leads to a two-stage meta-algorithm that first uses

12: classical projected gradient iterations to identify the correct

13: subspace containing an optimal solution, and then switches to

14: a highly-efficient smooth optimization method in the identified

15: subspace to attain superlinear convergence.

16: Experiments demonstrate that the proposed accelerated algorithms are

17: magnitudes faster than their non-accelerated counterparts as well as

18: the state of the art.

19: \end{abstract}