\begin{abstract}
In this paper, a general stochastic optimization procedure is studied, unifying
several variants of stochastic gradient descent such as, among others, the
stochastic heavy ball method, the Stochastic Nesterov Accelerated Gradient
algorithm (S-NAG), and the widely used \textsc{Adam} algorithm. The algorithm
is seen as a noisy Euler discretization of a non-autonomous ordinary
differential equation, recently introduced by Belotto da Silva and Gazeau,
which is analyzed in depth.  Assuming that the objective function is non-convex
and differentiable, the stability and the almost sure convergence of the
iterates to the set of critical points are established.  A noteworthy special
case is the convergence proof of S-NAG in a non-convex setting.  Under some
assumptions, the convergence rate is provided in the form of a Central Limit
Theorem.  Finally, the non-convergence of the algorithm to undesired critical
points, such as local maxima or saddle points, is established.  Here, the main
ingredient is a new avoidance-of-traps result for non-autonomous settings,
which is of independent interest.
\end{abstract}