\begin{abstract}
Multi-agent reinforcement learning has been successfully applied to a number
of challenging problems. Despite these empirical successes, theoretical
understanding of different algorithms is lacking, primarily due
to the curse of dimensionality caused by the exponential growth
of the state-action space with the number of agents. We study the
fundamental problem of the multi-agent linear quadratic regulator
in a setting where the agents are partially exchangeable. In this
setting, we develop a hierarchical actor-critic algorithm, whose
computational complexity is independent of the total number of agents,
and prove its global linear convergence to the optimal policy.
As linear quadratic regulators are often used to approximate
general dynamic systems, this paper provides an important step
towards a better understanding of general hierarchical mean-field
multi-agent reinforcement learning.
\end{abstract}
