\begin{abstract}
    Despite the strong theoretical guarantees that variance-reduced finite-sum
    optimization algorithms enjoy, their applicability remains limited to cases
    where the memory overhead they introduce (SAG/SAGA) or the periodic full
    gradient computation they require (SVRG/SARAH) is manageable.
    A promising approach to achieving variance reduction while avoiding these
    drawbacks is the use of importance sampling instead of control variates.
    While many such methods have been proposed in the literature,
    directly proving that they improve the convergence of the resulting optimization
    algorithm has remained elusive.
    In this work, we propose an importance-sampling-based algorithm that we call SRG
    (stochastic reweighted gradient).
    We analyze the convergence of SRG in the strongly convex case and show that, while
    it does not recover the linear rate of control-variate methods, it provably
    outperforms SGD.
    We pay particular attention to the time and memory overhead of our proposed method,
    and design a specialized red-black tree that allows it to be implemented
    efficiently. Finally, we present empirical results to support our findings.
\end{abstract}
