\begin{abstract}
  Multi-agent reinforcement learning has been successfully applied to a number
  of challenging problems. Despite these empirical successes, theoretical
  understanding of these algorithms is lacking, primarily due
  to the curse of dimensionality caused by the exponential growth
  of the state-action space with the number of agents. We study the
  fundamental problem of the multi-agent linear quadratic regulator
  in a setting where the agents are partially exchangeable. In this
  setting, we develop a hierarchical actor-critic algorithm whose
  computational complexity is independent of the total number of agents,
  and prove its global linear convergence to the optimal policy.
  As linear quadratic regulators are often used to approximate
  general dynamic systems, this paper provides an important step
  toward a better understanding of general hierarchical mean-field
  multi-agent reinforcement learning.
\end{abstract}