\begin{abstract}
Variance reduction
is commonly used in stochastic optimization.
%on regularized risk minimization.
It relies crucially on the assumption that the data set is finite.
However,
when the data are perturbed with random noise, as in data augmentation,
%the objective involves an expectation over noise,
the perturbed data set becomes essentially infinite.
Recently, the stochastic MISO (S-MISO)
algorithm was introduced to address this
expected risk minimization problem.
Though it
converges faster than SGD,
it requires a significant amount of memory.
In this paper, we propose two SGD-like algorithms for expected risk minimization with random perturbation, namely,
stochastic sample average gradient (SSAG) and stochastic SAGA (S-SAGA).
The memory cost of SSAG does not depend on the sample size, while that of S-SAGA is the same as that of variance reduction methods on unperturbed data.
Theoretical analysis and
experimental results on logistic regression and AUC maximization
show that SSAG has a faster convergence rate than SGD with a comparable space requirement, while S-SAGA outperforms S-MISO in terms of both iteration complexity and
storage.
%with dropout noise.

%Convergence results on strongly convex and composite objectives are provided. the convergence rate of one proposed variant S-SAGA relies on a small constant factor, which depends on the variance of derivative of the loss due to random perturbations of input data. The experimental results show that the proposed methods significantly outperform SGD and S-MISO with little memory cost.

\end{abstract}