\begin{abstract}Linear discriminant analysis (LDA) is a classical method
		for dimensionality reduction, in which
		discriminant vectors are sought to project data to a lower-dimensional
		space for optimal separability of classes. Several recent papers have
		outlined strategies for exploiting sparsity to apply LDA to high-dimensional data.
		However, many of these strategies lack scalable methods for solving
		the underlying optimization problems.
		% Our approach.
		We propose three new numerical optimization schemes for solving
		the sparse optimal scoring formulation of LDA, based
		on block coordinate descent, the proximal gradient method,
		and the alternating direction method of multipliers.
		% Results.
		We show that the per-iteration cost of these methods scales linearly in
		the dimension of the data provided restricted regularization
		terms are employed, and cubically in the dimension
		of the data in the worst case.
		% Convergence.
		Furthermore, we establish that if our block coordinate descent
		framework generates convergent subsequences of iterates, then
		these subsequences converge to stationary points of the
		sparse optimal scoring problem.
		% Empirical results.
		We demonstrate
		the effectiveness of our new methods with empirical results
		for classification of Gaussian data and data sets drawn from
		benchmarking repositories, including time-series and multispectral X-ray data, and we provide
		\texttt{Matlab} and \texttt{R} implementations of our optimization schemes.
	\end{abstract}