abstract:aeb79db893300056.tex

1: \begin{abstract}

2: Constrained Markov games offer a formal mathematical framework for modeling multi-agent reinforcement learning problems where the behavior of the agents is subject to constraints.

3: In this work, we focus on the recently introduced class of constrained Markov Potential Games.

4: While centralized algorithms have been proposed for solving such constrained games, the design of convergent independent learning algorithms tailored to the constrained setting remains an open question.

5: We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria: Each agent observes their own actions and rewards, along with a shared state.

6: Inspired by the optimization literature, our algorithm performs proximal-point-like updates augmented with a regularized constraint set. Each proximal step is solved inexactly using a stochastic switching gradient algorithm.

7: Notably, our algorithm can be implemented independently, without a centralized coordination mechanism that requires turn-based agent updates.

8: Under some technical constraint qualification conditions, we establish convergence guarantees towards approximate constrained Nash equilibria.

9: We perform simulations to illustrate our results.

10: \end{abstract}

11: