\begin{abstract}
    In machine learning and statistical data analysis, we often encounter an objective function that is a sum whose number of terms can equal the sample size, which may be enormous.
In such a setting, the stochastic mirror descent (SMD) algorithm is a numerically efficient method, since each iteration involves only a very small subset of the data.
The variance-reduced version of SMD (VRSMD) can further improve on SMD by inducing faster convergence.
On the other hand, algorithms such as gradient descent and stochastic gradient descent have an implicit regularization property that leads to better performance in terms of generalization error.
Little is known about whether such a property holds for VRSMD.
We prove here that the discrete VRSMD estimator sequence converges to the minimum mirror interpolant in linear regression.
This establishes the implicit regularization property for VRSMD.
As an application of this result, we derive a model estimation accuracy result in the setting where the true model is sparse.
We use numerical examples to illustrate the empirical power of VRSMD.
\end{abstract}