\begin{abstract}
We consider derivative-free algorithms for stochastic and
non-stochastic convex optimization problems that use only function values
rather than gradients. Focusing on non-asymptotic bounds on
convergence rates, we show that if pairs of function values are
available, algorithms for $d$-dimensional optimization that
use gradient estimates based on random perturbations suffer a factor
of at most $\sqrt{d}$ in convergence rate over traditional
stochastic gradient methods. We establish such results for both
smooth and non-smooth cases, sharpening previous analyses that
suggested a worse dimension dependence, and extend our
results to the case of multiple ($\numobs \ge 2$) evaluations.
We complement our
algorithmic development with information-theoretic lower bounds on
the minimax convergence rate of such problems, establishing the
sharpness of our achievable results up to constant (sometimes
logarithmic) factors.
\end{abstract}