abstract:c1b837961b9007aa.tex

1: \begin{abstract}

2: \vspace{-.1in}

3:     Many stochastic optimization algorithms estimate the gradient of the cost function on the fly by sampling datapoints uniformly at random from a training set.

4: 	%

5: However, this estimator can have a large variance, which inadvertently slows the convergence of these algorithms.

6: %

7: One way to reduce this variance is to sample the datapoints from a carefully selected non-uniform distribution. %, which then need to be determined, and is a challenging task.

8: %

9: % Previous work minimizes an upper bound of the variance, but the gap between this upper bound and the optimal variance may remain large.

10: In this work, we propose a novel non-uniform sampling approach that uses the multi-armed bandit framework.

11: %

12: Theoretically, we show that our algorithm asymptotically approximates the optimal variance within a factor of 3.

13: %

14: Empirically, we show that this datapoint-selection technique significantly reduces the convergence time and variance of several stochastic optimization algorithms, such as SGD and SAGA.

15: %

16: This approach to sampling datapoints is general and can be used in conjunction with \emph{any} algorithm that uses an unbiased gradient estimator; we expect it to have broad applicability beyond the specific examples explored in this work.

17: \end{abstract}

18: