\begin{abstract}
We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems. In particular, the objective function is the sum of a differentiable (possibly nonconvex) component and a possibly non-differentiable but convex component.
We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+.
Our main contribution lies in the analysis of ProxSVRG+.
The analysis recovers several existing convergence results and improves or generalizes them in terms of the number of stochastic gradient oracle calls and proximal oracle calls.
In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by \citet{lei2017non} for the smooth nonconvex case.
ProxSVRG+ is also more straightforward than SCSG and yields a simpler analysis.
Moreover, ProxSVRG+ outperforms deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem posed by \citet{reddi2016proximal}.
ProxSVRG+ also uses far fewer proximal oracle calls than ProxSVRG \citep{reddi2016proximal}.
Furthermore, for nonconvex functions satisfying the Polyak-\L{}ojasiewicz (PL) condition, we prove that ProxSVRG+ achieves a global linear convergence rate without restart, unlike ProxSVRG.
Thus, it can \emph{automatically} switch to the faster linear convergence in some regions as long as the objective function satisfies the PL condition locally in these regions.
In this case, ProxSVRG+ also improves on ProxGD and ProxSVRG/SAGA, and generalizes the results of SCSG.
Finally, we conduct several experiments, and the experimental results are consistent with the theoretical results.
\end{abstract}
