\begin{abstract}
Constrained Markov games offer a formal mathematical framework for modeling multi-agent reinforcement learning problems where the behavior of the agents is subject to constraints.
In this work, we focus on the recently introduced class of constrained Markov Potential Games.
While centralized algorithms have been proposed for solving such constrained games, the design of convergent independent learning algorithms tailored to the constrained setting remains an open question.
We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria: each agent observes its own actions and rewards, along with a shared state.
Inspired by the optimization literature, our algorithm performs proximal-point-like updates augmented with a regularized constraint set. Each proximal step is solved inexactly using a stochastic switching gradient algorithm.
Notably, our algorithm can be implemented independently, without a centralized coordination mechanism that requires turn-based agent updates.
Under some technical constraint qualification conditions, we establish convergence guarantees towards approximate constrained Nash equilibria.
We perform simulations to illustrate our results.
\end{abstract}