\begin{abstract}
Reinforcement learning has significant applications for
multi-agent systems, especially in unknown dynamic environments.
However, most multi-agent reinforcement learning (MARL) algorithms
suffer from problems such as exponential computational complexity in
the joint state-action space, which makes them difficult to scale up
to realistic multi-agent problems. In this paper, a novel
algorithm named negotiation-based MARL with sparse interactions
(NegoSI) is presented. In contrast to traditional
sparse-interaction-based MARL algorithms, NegoSI adopts the
equilibrium concept and makes it possible for agents to select the
non-strict Equilibrium Dominating Strategy Profile (non-strict
EDSP) or Meta equilibrium for their joint actions. The presented
NegoSI algorithm consists of four parts: the equilibrium-based
framework for sparse interactions, the negotiation for the
equilibrium set, the minimum variance method for selecting one
joint action, and the knowledge transfer of local Q-values. In this
integrated algorithm, three techniques, i.e., unshared value
functions, equilibrium solutions, and sparse interactions, are
adopted to achieve privacy protection, better coordination, and
lower computational complexity, respectively. To evaluate the
performance of the presented NegoSI algorithm, two groups of
experiments are carried out with respect to three criteria: steps of
each episode (SEE), rewards of each episode (REE), and average
runtime (AR). The first group of experiments is conducted using
six grid world games and shows fast convergence and high scalability of
the presented algorithm. In the second group of experiments,
NegoSI is applied to an intelligent warehouse problem, and
simulation results demonstrate the effectiveness of the presented
NegoSI algorithm compared with other state-of-the-art MARL
algorithms.
\end{abstract}