1: \begin{abstract}%
2: Sparse coding---that is, modelling data vectors as sparse linear
3: combinations of basis elements---is widely used in machine learning,
4: neuroscience, signal processing, and statistics. This paper focuses on the
5: large-scale matrix factorization problem that consists of {\em learning}
6: the basis set in order to adapt it to specific data. Variations of this
7: problem include dictionary learning in signal processing, non-negative
8: matrix factorization and sparse principal component analysis. In this
9: paper, we propose to address these tasks with a new online optimization
10: algorithm, based on stochastic approximations, which scales up gracefully
11: to large data sets with millions of training samples, and extends naturally
12: to various matrix factorization formulations, making it suitable for a
13: wide range of learning problems. A proof of convergence is presented,
14: along with experiments with natural images and genomic data demonstrating
15: that it leads to state-of-the-art performance in terms of speed and
16: optimization for both small and large data sets.
17: \end{abstract}
18: