abstract:dc0ca10c895934ce.tex

1: \begin{abstract}%

2:    Sparse coding---that is, modelling data vectors as sparse linear

3:    combinations of basis elements---is widely used in machine learning,

4:    neuroscience, signal processing, and statistics. This paper focuses on the

5:    large-scale matrix factorization problem that consists of {\em learning}

6:    the basis set in order to adapt it to specific data. Variations of this

7:    problem include dictionary learning in signal processing, non-negative

8:    matrix factorization and sparse principal component analysis. In this

9:    paper, we propose to address these tasks with a new online optimization

10:    algorithm, based on stochastic approximations, which scales up gracefully

11:    to large data sets with millions of training samples, and extends naturally

12:    to various  matrix factorization formulations, making it suitable for a

13:    wide range of learning problems. A proof of convergence is presented,

14:    along with experiments with natural images and genomic data demonstrating

15:    that it leads to state-of-the-art performance in terms of speed and

16:    optimization for both small and large data sets.

17: \end{abstract}

18: