
\begin{abstract}

Sparse coding is a core building block in many data analysis and machine learning pipelines.

It is typically solved with generic optimization techniques, such as the Iterative Soft Thresholding Algorithm (ISTA) and its accelerated version (FISTA).

These methods are optimal in the class of first-order methods for non-smooth, convex functions.

However, they exploit neither the particular structure of the problem at hand nor the distribution of the input data.

An acceleration using neural networks, coined LISTA, was proposed in \cite{Gregor10}, which showed empirically that high-quality estimates can be achieved in few iterations by appropriately modifying the parameters of the proximal splitting.


In this paper, we study the reasons for this acceleration.

Our mathematical analysis reveals that it is related to a specific matrix factorization of the Gram kernel of the dictionary, which attempts to nearly diagonalize the kernel with a basis that produces a small perturbation of the $\ell_1$ ball.

When this factorization succeeds, we prove that the resulting splitting algorithm enjoys an improved convergence bound with respect to the non-adaptive version.

Moreover, our analysis shows that the conditions for acceleration occur mostly at the beginning of the iterative process, consistent with numerical experiments.

We further validate our analysis by showing that adaptive acceleration fails on dictionaries for which this factorization does not exist.


\end{abstract}
