\begin{abstract}
\textbf{This paper is dedicated to the memory of Boris Teodorovich Polyak.}

In this paper, we study the convergence properties of the Stochastic
Gradient Descent (SGD) method for finding a stationary point
of a given objective function $J(\cdot)$.
The objective function is not required to be convex.
Rather, our results apply to a class of ``invex'' functions, which have the
property that every stationary point is also a global minimizer.
First, it is assumed that $J(\cdot)$ satisfies a property that
is slightly weaker than the Kurdyka--{\L}ojasiewicz (KL) condition,
denoted here as (KL').
It is shown that the sequence $J(\bth_t)$ converges almost surely
to the global minimum of $J(\cdot)$.
Next, the hypothesis on $J(\cdot)$ is strengthened from (KL') to
the Polyak--{\L}ojasiewicz (PL) condition.
With this stronger hypothesis, we derive estimates on the rate of
convergence of $J(\bth_t)$ to its limit.
Using these results, we show that for functions satisfying the PL property,
the convergence rates of both the objective function
and the norm of the gradient under SGD are the same as the best-possible
rate for convex functions.
While some results along these lines have been published in the past,
our contributions contain two distinct improvements.
First, the assumptions on the stochastic gradient are more general
than elsewhere, and second, our convergence is almost sure, rather than
in expectation.
We also study SGD when only function evaluations are permitted.
In this setting, we determine the ``optimal'' increments, that is, the sizes
of the perturbations.
Using the same set of ideas, we establish the global convergence
of the Stochastic Approximation (SA) algorithm under more general
assumptions on the measurement error than those in the existing literature.
We also derive bounds on the rate of convergence of the SA algorithm
under appropriate assumptions.
\end{abstract}