\begin{abstract}
    Recent advances in multiagent learning have seen the introduction of a family of algorithms that revolve around the population-based training method PSRO, showing convergence to Nash, correlated and coarse correlated equilibria.
    However, as the number of agents increases, learning best responses becomes exponentially more difficult, which hampers PSRO training methods.
    The paradigm of mean-field games provides an asymptotic solution to this problem when the considered games are anonymous-symmetric.
    Unfortunately, the mean-field approximation introduces non-linearities which prevent a straightforward adaptation of PSRO.
    Building upon optimization and adversarial regret minimization, this paper sidesteps this issue and introduces mean-field PSRO, an adaptation of PSRO which learns Nash, coarse correlated and correlated equilibria in mean-field games.
    The key is to replace the exact distribution computation step with newly-defined mean-field no-adversarial-regret learners, or with black-box optimization.
    We compare the asymptotic complexity of the approach to that of standard PSRO, greatly improve empirical bandit convergence speed by compressing temporal mixture weights, and ensure that it is theoretically robust to payoff noise.
    Finally, we illustrate the speed and accuracy of mean-field PSRO on several mean-field games, demonstrating convergence to strong and weak equilibria.
\end{abstract}