\begin{abstract}
In many estimation problems, \eg linear and logistic regression, we
wish to minimize an unknown objective given only unbiased samples
of the objective function. Furthermore, we aim to achieve this using as few samples as
possible. In the absence of computational constraints, the
minimizer of a sample average of observed data -- commonly referred
to as either the empirical risk minimizer (ERM) or the $M$-estimator
-- is widely regarded as the estimation strategy of choice due to
its desirable statistical convergence properties. Our goal in this work is to perform
as well as the ERM, on \emph{every} problem,
while minimizing the use of computational resources such as running
time and space usage.

We provide a simple streaming algorithm which, under
standard regularity assumptions on the underlying problem, enjoys
the following properties:
\begin{enumerate}[noitemsep]
\item The algorithm can be implemented in linear time with a single
pass of the observed data, using space linear in the size of a
single sample.
\item The algorithm achieves the same statistical rate of
convergence as the empirical risk minimizer on every problem, even
considering constant factors.
\item The algorithm's performance depends on the initial error at a
rate that decreases super-polynomially.
\item The algorithm is easily parallelizable.
\end{enumerate}
Moreover, we quantify the (finite-sample) rate at which the
algorithm becomes competitive with the ERM.
\end{abstract}