abstract:8d6171f3bd8f11e6.tex

1: \begin{abstract}

2:     In machine learning and statistical data analysis, we often encounter an objective function that is a summation, where the number of terms may equal the sample size, which can be enormous.

3: In such a setting, the stochastic mirror descent (SMD) algorithm is a numerically efficient method, since each iteration involves only a very small subset of the data.

4: The variance-reduced version of SMD (VRSMD) further improves on SMD by inducing faster convergence.

5: On the other hand, algorithms such as gradient descent and stochastic gradient descent have an implicit regularization property that leads to better performance in terms of generalization error.

6: Little is known about whether such a property holds for VRSMD.

7: We prove here that the discrete VRSMD estimator sequence converges to the minimum mirror interpolant in linear regression.

8: This establishes the implicit regularization property for VRSMD.

9: As an application of the above result, we derive a model estimation accuracy result in the setting where the true model is sparse.

10: We use numerical examples to illustrate the empirical power of VRSMD.

11: \end{abstract}

12: