\begin{abstract}
	\kmeans, along with the celebrated Lloyd algorithm, is more than the clustering method it was originally designed to be.
	It has proven pivotal in speeding up many machine learning and data analysis techniques, such as indexing, nearest-neighbor search and prediction, and data compression; its benefits have been shown to carry over to the acceleration of kernel machines (when using the Nyström method).
	Here, we propose a fast extension of \kmeans, dubbed \texttt{QuicK-means}, that rests on the idea of expressing the matrix of the $\nclusters$ centroids as a product of sparse matrices, a feat made possible by recent results on approximating matrices as products of sparse factors. Using such a decomposition reduces the complexity of the matrix-vector product between the factorized $\nclusters \times \datadim$ centroid matrix $\mathbf{U}$ and any vector from $\mathcal{O}(\nclusters \datadim)$ to $\mathcal{O}(A \log A+B)$, with $A=\min (\nclusters, \datadim)$ and $B=\max (\nclusters, \datadim)$, where $\datadim$ is the dimension of the training data. This computational saving has a direct impact on the assignment of a point to a cluster, meaning that it is tangible not only at prediction time but also at training time, provided the factorization procedure is performed during Lloyd's algorithm. In particular, we show that resorting to a factorization step at each iteration does not impair the convergence of the optimization scheme and that, depending on the context, it may reduce the training time. Finally, we provide discussions and numerical simulations that show the versatility of our computationally efficient \texttt{QuicK-means} algorithm.
\end{abstract}
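
As a rough illustration of the complexity claim made in the abstract, and not a result taken from the experiments, consider the hypothetical sizes $\nclusters = \datadim = 1024$, so that $A = B = 1024$. Taking logarithms in base $2$ and ignoring the constants hidden by the $\mathcal{O}$ notation (both are assumptions made only for this example), the two operation counts compare as
\[
	\nclusters \datadim = 1024 \times 1024 = 1\,048\,576
	\qquad \mbox{versus} \qquad
	A \log_2 A + B = 1024 \times 10 + 1024 = 11\,264,
\]
that is, a ratio of roughly $93$ in favor of the factorized product. The gain obtained in practice depends on the sparsity level and on the number of factors used in the decomposition.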