\begin{abstract}

    The stochastic gradient descent (SGD) method is widely used to solve non-convex optimization problems in machine learning.
    This work investigates SGD from the viewpoint of graduated optimization, a widely applied approach to non-convex optimization.
    Instead of solving the original optimization problem directly, the graduated optimization approach solves a sequence of smoothed optimization problems, which can be constructed in various ways.
    We provide a formal formulation of graduated optimization based on nonnegative approximate identities, which generalizes the idea of Gaussian smoothing.
    We also establish an asymptotic convergence result using techniques from variational analysis.
    We then show that the traditional SGD method can be applied to solve the smoothed optimization problem.
    The gradient of the smoothed problem is estimated by Monte Carlo integration, which may be compatible with distributed computing schemes in real-life applications.
    Under the assumptions on the original optimization problem, convergence results of SGD for the smoothed problem follow straightforwardly.
    Numerical examples provide evidence that the graduated optimization approach may yield more accurate training results in certain cases.

\end{abstract}