0f10da069684b5e9.tex
1: \begin{abstract}
2:   We consider derivative-free algorithms for stochastic and
3:   non-stochastic convex optimization problems that use only function values
4:   rather than gradients.  Focusing on non-asymptotic bounds on
5:   convergence rates, we show that if pairs of function values are
6:   available, algorithms for $d$-dimensional optimization that
7:   use gradient estimates based on random perturbations converge more slowly
8:   than traditional stochastic gradient methods by a factor of at most $\sqrt{d}$
9:   stochastic gradient methods.  We establish such results for both
10:   smooth and non-smooth cases, sharpening previous analyses that
11:   suggested a worse dimension dependence, and extend our
12:   results to the case of multiple ($\numobs \ge 2$) evaluations.
13:   We complement our
14:   algorithmic development with information-theoretic lower bounds on
15:   the minimax convergence rate of such problems, establishing the
16:   sharpness of our achievable results up to constant (or, in some cases,
17:   logarithmic) factors.
18: \end{abstract}
19: