\begin{abstract}
The proximal gradient method plays an important role in solving many machine learning tasks,
especially nonsmooth problems.
However, in some machine learning problems, such as bandit models and black-box learning,
the proximal gradient method can fail because explicit gradients are difficult
or infeasible to obtain. Gradient-free (zeroth-order) methods can address these problems because they require only objective function values during optimization.
Recently, the first zeroth-order proximal stochastic
algorithm was proposed to solve nonconvex nonsmooth problems.
However, its convergence rate is $O(\frac{1}{\sqrt{T}})$ for nonconvex problems,
which is significantly slower than the best convergence rate $O(\frac{1}{T})$ of zeroth-order stochastic algorithms,
where $T$ is the number of iterations.
To fill this gap, in this paper we propose a class of faster zeroth-order proximal stochastic methods
based on the variance reduction techniques of SVRG and SAGA,
denoted ZO-ProxSVRG and ZO-ProxSAGA, respectively.
In the theoretical analysis, we address the main challenge that
an unbiased estimate of the true gradient,
which previous analyses of both SVRG and SAGA require, is not available in the zeroth-order case.
Moreover, we prove that both ZO-ProxSVRG and ZO-ProxSAGA have
$O(\frac{1}{T})$ convergence rates.
Finally, experimental results
verify that our algorithms
converge faster than the existing zeroth-order proximal stochastic algorithm.
\end{abstract}
