\begin{abstract}
Stochastic gradient descent (SGD) and its variants are among the most successful approaches for solving large-scale
optimization problems. At each iteration, SGD employs an unbiased estimator of the full gradient computed from
a single randomly selected data point. Hence, it scales well with problem size, is very attractive for handling
truly massive datasets, and holds significant potential for solving large-scale inverse problems. In this work, we
rigorously establish its regularizing property under an \textit{a priori} early stopping rule for linear
inverse problems, and also prove convergence rates under the canonical sourcewise condition.
This is achieved by combining tools from classical regularization theory and stochastic analysis. Further, we analyze its
preasymptotic weak and strong convergence behavior, in order to explain the fast initial convergence typically
observed in practice. The theoretical findings shed light on the performance of the algorithm, and are
complemented with illustrative numerical experiments.\\
{\bf Keywords}: stochastic gradient descent; regularizing property; error estimates; preasymptotic convergence.
\end{abstract}
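
\noindent For orientation, the iteration summarized in the abstract can be sketched as follows. The notation is an assumption introduced here for illustration and is not fixed by the abstract itself: a linear system $Ax = y$ with a real $n\times m$ matrix $A$ whose $i$th row is $a_i^t$, noisy data $y^\delta$, step sizes $\eta_k > 0$, and an index $i_k$ drawn uniformly at random from $\{1,\ldots,n\}$. Writing the least-squares objective as an average of per-sample terms,
\[
  J(x) = \frac{1}{2n}\|Ax - y^\delta\|^2 = \frac{1}{n}\sum_{i=1}^n f_i(x),
  \qquad
  f_i(x) = \frac{1}{2}\big((a_i,x) - y_i^\delta\big)^2,
\]
one step of SGD would then read
\[
  x_{k+1} = x_k - \eta_k\big((a_{i_k},x_k) - y_{i_k}^\delta\big)a_{i_k},
\]
which touches only one row of $A$ per iteration. Under the uniform sampling assumed above, the single-sample gradient is an unbiased estimator of the full gradient, since
\[
  \mathrm{E}_{i_k}\big[\nabla f_{i_k}(x)\big]
  = \frac{1}{n}\sum_{i=1}^n \big((a_i,x) - y_i^\delta\big)a_i
  = \frac{1}{n}A^t(Ax - y^\delta)
  = \nabla J(x).
\]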