\begin{abstract}
In many estimation problems, \eg linear and logistic regression, we
wish to minimize an unknown objective given only unbiased samples
of the objective function, and to do so using as few samples as
possible. In the absence of computational constraints, the
minimizer of a sample average of observed data -- commonly referred
to as either the empirical risk minimizer (ERM) or the $M$-estimator
-- is widely regarded as the estimation strategy of choice due to
its desirable statistical convergence properties. Our goal in this work is to perform
as well as the ERM on \emph{every} problem,
while minimizing the use of computational resources such as running
time and space.

We provide a simple streaming algorithm which, under
standard regularity assumptions on the underlying problem, enjoys
the following properties:
\begin{enumerate}[noitemsep]
\item The algorithm can be implemented in linear time with a single
pass over the observed data, using space linear in the size of a
single sample.
\item The algorithm achieves the same statistical rate of
convergence as the empirical risk minimizer on every problem,
including constant factors.
\item The dependence of the algorithm's performance on the initial
error decreases at a super-polynomial rate.
\item The algorithm is easily parallelizable.
\end{enumerate}
Moreover, we quantify the (finite-sample) rate at which the
algorithm becomes competitive with the ERM.
\end{abstract}
