\begin{abstract}
[Zhang, ICML 2018] provided
the first decentralized actor-critic algorithm for multi-agent reinforcement learning (MARL) that offers convergence guarantees.
In that work, policies are stochastic and are defined on finite action spaces.
We extend those results to offer a provably convergent decentralized actor-critic algorithm
for learning deterministic policies on continuous action spaces. Deterministic policies are important in real-world settings. To handle the lack of exploration inherent in deterministic policies, we consider both off-policy and on-policy settings.
We provide the expression of a local deterministic policy gradient,
decentralized deterministic actor-critic algorithms,
and convergence guarantees for linearly approximated value functions.
This work will help enable decentralized MARL in high-dimensional action spaces
and pave the way for more widespread use of MARL.
\end{abstract}
