\begin{abstract}Linear discriminant analysis (LDA) is a classical method
for dimensionality reduction in which
discriminant vectors are sought to project data to a lower-dimensional
space for optimal separability of classes. Several recent papers have
outlined strategies that exploit sparsity to apply LDA to high-dimensional data.
However, many of these lack scalable methods for solving
the underlying optimization problems.
% Our approach.
We propose three new numerical optimization schemes for solving
the sparse optimal scoring formulation of LDA, based
on block coordinate descent, the proximal gradient method,
and the alternating direction method of multipliers.
% Results.
We show that the per-iteration cost of these methods scales linearly in
the dimension of the data provided that restricted regularization
terms are employed, and cubically in the dimension
of the data in the worst case.
% Convergence.
Furthermore, we establish that if our block coordinate descent
framework generates convergent subsequences of iterates, then
these subsequences converge to stationary points of the
sparse optimal scoring problem.
% Empirical results.
We demonstrate
the effectiveness of our new methods with empirical results
for classification of Gaussian data and data sets drawn from
benchmarking repositories, including time-series and multispectral X-ray data, and we provide
\texttt{Matlab} and \texttt{R} implementations of our optimization schemes.
\end{abstract}