\begin{abstract}
In this paper we present a `stability theorem' for stochastic approximation (SA)
algorithms with `controlled Markov' noise.
Such algorithms were first studied by \textbf{Borkar} in \textit{2006}.
%Sufficient conditions are presented for the `stability and convergence' of such algorithms.
Specifically, sufficient conditions are presented
%, that are consistent with the assumptions made by \textbf{Borkar},
which guarantee the stability of the iterates.
Further, under these conditions, the iterates are shown to track a solution to the differential
inclusion defined in terms of the ergodic occupation measures associated with the
`controlled Markov' process. As an application of our main result,
we present an improvement to a general form of temporal difference learning algorithms.
Specifically, we present sufficient conditions for their stability and convergence using
our framework.
This paper builds on the works of \textbf{Borkar}
and \textbf{Benveniste, M\'etivier and Priouret}.
\end{abstract}