
\begin{abstract}
Asynchronous stochastic approximations (SAs) are an important class of model-free algorithms, tools and techniques that are popular in multi-agent and distributed control scenarios. To counter Bellman's curse of dimensionality, such algorithms are coupled with function approximations. Although the learning/control problem becomes more tractable, function approximations affect stability and convergence.
In this paper, we present verifiable sufficient conditions for stability and convergence of asynchronous SAs with biased approximation errors. The theory developed herein is used to analyze policy gradient methods and noisy value iteration schemes.
Specifically, we analyze the asynchronous approximate counterparts of the policy gradient (A2PG) and value iteration (A2VI) schemes.
It is shown that the stability of these algorithms is unaffected by biased approximation errors, provided the errors are asymptotically bounded. With respect to convergence of A2VI and A2PG, a relationship between the limiting set and the approximation errors is established. Finally, experimental results are presented that support the theory.
\end{abstract}
