7c17b2a7e3a9c746.tex
1: \begin{abstract}
2: 
3: We investigate decentralized multi-agent navigation, where multiple agents must reach initially unassigned targets within a limited time. Classical planning-based methods incur high computational overhead at each step and offer limited expressiveness for complex cooperation strategies. In contrast, reinforcement learning~(RL) has recently become a popular paradigm for addressing this problem.
4: However, RL suffers from low data efficiency and poor cooperation when directly searching for (nearly) optimal policies in a large search space, especially as the number of agents grows~(e.g., 10+ agents) or in complex environments~(e.g., 3D simulators).
5: In this paper, we propose \emph{\underline{M}ulti-\underline{A}gent \underline{S}calable GNN-based \underline{P}lanner}~({\name}), a goal-conditioned hierarchical planner for navigation tasks with a substantial number of agents. {\name} adopts a hierarchical framework that divides a large search space into multiple smaller spaces, thereby reducing the space complexity and accelerating training convergence. We also leverage graph neural networks~(GNNs) to model the interactions between agents and goals, which improves goal achievement. In addition, to enhance generalization to scenarios with unseen team sizes, we divide agents into multiple groups, each containing a number of agents seen during training.
6: Experimental results demonstrate that {\name} outperforms classical planning-based competitors and RL baselines, achieving a nearly 100\% success rate with minimal training data both in multi-agent particle environments~(MPE) with 50 agents and in a 3D quadrotor environment~(OmniDrones) with 20 agents. Furthermore, the learned policy exhibits zero-shot generalization to unseen team sizes.\looseness=-1
7: \end{abstract}
8: