\begin{abstract}

The proximal gradient method plays an important role in many machine learning tasks,
especially those with nonsmooth objectives.
However, in some machine learning problems, such as bandit models and black-box learning,
the proximal gradient method can fail because explicit gradients are difficult
or infeasible to obtain.
Gradient-free (zeroth-order) methods can address these problems because they require only objective function values during optimization.
Recently, the first zeroth-order proximal stochastic
algorithm was proposed for nonconvex nonsmooth problems.
However, its convergence rate is $O(\frac{1}{\sqrt{T}})$ for nonconvex problems,
which is significantly slower than the best convergence rate $O(\frac{1}{T})$ of zeroth-order stochastic algorithms,
where $T$ is the number of iterations.
To fill this gap, in this paper we propose a class of faster zeroth-order proximal stochastic methods
based on the variance reduction techniques of SVRG and SAGA,
denoted ZO-ProxSVRG and ZO-ProxSAGA, respectively.
In the theoretical analysis, we address the main challenge that
the zeroth-order gradient estimate is not an unbiased estimate of the true gradient,
a property required by previous theoretical analyses of both SVRG and SAGA.
Moreover, we prove that both ZO-ProxSVRG and ZO-ProxSAGA achieve
$O(\frac{1}{T})$ convergence rates.
Finally, experimental results
verify that our algorithms converge faster
than the existing zeroth-order proximal stochastic algorithm.
\end{abstract}
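
For orientation, the ingredients named in the abstract can be written in symbols. The following is only a minimal sketch under assumed notation: the symbols $f_i$, $\psi$, $\eta$, $\mu$, $u$, $v_t$, and $\tilde{x}$, as well as the choice of a Gaussian-smoothing estimator, are illustrative assumptions and not necessarily the exact definitions used in the paper. Suppose the problem is $\min_{x} f(x) + \psi(x)$ with $f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$, where each $f_i$ is smooth but possibly nonconvex and $\psi$ is convex but possibly nonsmooth. A proximal stochastic step with step size $\eta > 0$ and search direction $v_t$ reads
\[
x_{t+1} = \mathrm{prox}_{\eta\psi}\bigl(x_t - \eta v_t\bigr),
\qquad
\mathrm{prox}_{\eta\psi}(y) = \arg\min_{x} \Bigl\{ \frac{1}{2\eta}\|x - y\|^2 + \psi(x) \Bigr\}.
\]
When gradients are unavailable, one common zeroth-order surrogate uses only two function values, a smoothing parameter $\mu > 0$, and a random direction $u \sim \mathcal{N}(0, I_d)$:
\[
\hat{\nabla} f_i(x) = \frac{f_i(x + \mu u) - f_i(x)}{\mu}\, u .
\]
Its expectation over $u$ is the gradient of the smoothed function $f_{i,\mu}(x) = \mathrm{E}_u\bigl[f_i(x + \mu u)\bigr]$, which in general differs from $\nabla f_i(x)$; this is the sense in which the estimate is biased with respect to the true gradient. An SVRG-style variance-reduced direction built from such estimates could then take the form
\[
v_t = \hat{\nabla} f_{i_t}(x_t) - \hat{\nabla} f_{i_t}(\tilde{x}) + \frac{1}{n}\sum_{i=1}^{n} \hat{\nabla} f_i(\tilde{x}),
\]
where $\tilde{x}$ is a snapshot point refreshed periodically and $i_t$ is sampled uniformly from $\{1,\dots,n\}$.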
