\begin{abstract}
Although multi-agent reinforcement learning (MARL) has been studied extensively, most existing work focuses on designing algorithms and proving convergence to a Nash equilibrium (NE) or to another solution concept such as a coarse correlated equilibrium. However, NEs can be non-unique, and their performance can vary drastically. It is therefore important to design algorithms that converge to NEs with higher rewards or social welfare. The classical game theory literature, by contrast, has extensively studied equilibrium selection for multi-agent learning in normal-form games, showing that decentralized learning algorithms can asymptotically converge to potential-maximizing or Pareto-optimal NEs. These insights motivate us to investigate equilibrium selection in the MARL setting. We focus on the stochastic game model and leverage classical equilibrium selection results from normal-form games to propose a unified framework for equilibrium selection in stochastic games. The proposed framework is highly modular: it can extend various learning rules, together with their corresponding equilibrium selection results, from normal-form games to the stochastic game setting.


% On the other hand, in game theory literatures, there are a lot of classical works studying equilibrium selection for multi-agent learning in the normal-form game setting, showing that there exists decentralized learning algorithm that asymptotically converges to the global-optimal or pareto-optimal NE. The above observations serve as the major motivation of this paper: to study equilibrium selection in the multi-agent reinforcement learning setting. We focus on the stochastic game model and use the classical results of equilibrium selection in the normal-form game setting as a building block to propose a unified framework for equilibrium selection in the stochastic game setting.
\end{abstract}