
\begin{abstract}

Variance reduction techniques are designed to decrease the sampling variance, thereby accelerating the convergence of first-order (FO) and zeroth-order (ZO) optimization methods.

However, in composite optimization problems, ZO methods encounter an additional source of variance, the coordinate-wise variance, which stems from random gradient estimation.

To reduce this variance, prior works require estimating all partial derivatives, essentially approximating FO information.

This approach demands $\OM(d)$ function evaluations, where $d$ is the problem dimension; the resulting computational cost is prohibitive in high-dimensional settings.

This paper proposes the Zeroth-order Proximal Double Variance Reduction (\texttt{ZPDVR}) method, which uses the averaging trick to reduce both the sampling variance and the coordinate-wise variance.

Compared to prior methods, \texttt{ZPDVR} relies solely on random gradient estimates, calls the stochastic zeroth-order oracle (SZO) $\OM(1)$ times per iteration in expectation, and achieves the optimal $\OM(d(n + \kappa)\log (\frac{1}{\epsilon}))$ SZO query complexity in the strongly convex and smooth setting, where $\kappa$ is the condition number and $\epsilon$ is the desired accuracy.

Empirical results validate \texttt{ZPDVR}'s linear convergence and demonstrate its superior performance over other related methods.

\end{abstract}
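
For orientation, the contrast between random and coordinate-wise gradient estimation can be sketched with two standard ZO estimators. These forms are an illustrative assumption drawn from the general ZO literature, not necessarily the exact estimators used by \texttt{ZPDVR}; the smoothing parameter $\mu > 0$, the random direction $u$ (uniform on the unit sphere), and the standard basis vectors $e_i$ are notation introduced here for illustration only.
\[
  \hat{\nabla}_{\mathrm{rand}} f(x) = \frac{d}{\mu}\bigl(f(x + \mu u) - f(x)\bigr)\, u,
  \qquad
  \hat{\nabla}_{\mathrm{coord}} f(x) = \sum_{i=1}^{d} \frac{f(x + \mu e_i) - f(x - \mu e_i)}{2\mu}\, e_i .
\]
The random estimator needs only two function evaluations, but the randomness of $u$ contributes variance on top of the sampling variance. The coordinate-wise estimator removes that randomness at the price of $2d$ function evaluations, which is the $\OM(d)$ cost referred to in the abstract.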
