\begin{abstract}
    Despite the strong theoretical guarantees that variance-reduced finite-sum
    optimization algorithms enjoy, their applicability remains limited to cases
    where the memory overhead they introduce (SAG/SAGA) or the periodic full
    gradient computation they require (SVRG/SARAH) is manageable.
    A promising approach to achieving variance reduction while avoiding these
    drawbacks is the use of importance sampling instead of control variates.
    While many such methods have been proposed in the literature,
    directly proving that they improve the convergence of the resulting optimization
    algorithm has remained elusive.
    In this work, we propose an importance-sampling-based algorithm we call SRG
    (stochastic reweighted gradient).
    We analyze the convergence of SRG in the strongly convex case and show that, while
    it does not recover the linear rate of control-variate methods, it provably
    outperforms SGD.
    We pay particular attention to the time and memory overhead of our proposed method,
    and design a specialized red-black tree that allows it to be implemented
    efficiently. Finally, we present empirical results to support our findings.
\end{abstract}