
\begin{abstract}


    The stochastic gradient descent (SGD) method is popular for solving non-convex optimization problems in machine learning.
    This work investigates SGD from the viewpoint of graduated optimization, a widely applied approach to non-convex optimization problems.
    Instead of solving the actual optimization problem directly, the graduated optimization approach solves a series of smoothed optimization problems, which can be constructed in various ways.
    In this work, we provide a formal formulation of graduated optimization based on the nonnegative approximate identity, which generalizes the idea of Gaussian smoothing.
    We also establish an asymptotic convergence result using techniques from variational analysis.
    Then, we show that the traditional SGD method can be applied to solve the smoothed optimization problem.
    Monte Carlo integration is used to estimate the gradient of the smoothed problem, which may be consistent with distributed computing schemes in real-life applications.
    Under the assumptions on the actual optimization problem, the convergence results of SGD for the smoothed problem can be derived straightforwardly.
    Numerical examples provide evidence that the graduated optimization approach may yield more accurate training results in certain cases.


\end{abstract}
