1: \begin{abstract}
2: We study the problem of minimizing the average of a very large number of smooth functions, which is of key importance in training supervised learning models. One of the most celebrated methods in this context is the SAGA algorithm of \citet{SAGA}. Despite years of research on the topic, a general-purpose version of SAGA, one that would include arbitrary importance sampling and minibatching schemes, does not exist. We remedy this situation and propose a general and flexible variant of SAGA following the {\em arbitrary sampling} paradigm. We perform an iteration complexity analysis of the method, made possible largely by the construction of new stochastic Lyapunov functions. We establish linear convergence rates in the smooth and strongly convex regime, and also under a quadratic functional growth condition (i.e., in a regime that does not assume strong convexity).
3: % SAGA is the only variance-reduced method achieving linear convergence without any a priori knowledge of the error bound condition number.
4: Our rates match those of the primal-dual method Quartz~\cite{Quartz}, for which an arbitrary sampling analysis is available. This is a significant step towards closing the gap in our understanding of the complexity of primal and dual methods for finite-sum problems.
5: % Finally, we show through experiments that specific variants of our general SAGA method can perform better in practice than other competing methods.
6: \end{abstract}
7: