\begin{abstract}
    The behaviour of multi-agent learning in competitive network games is often studied within the context of
    zero-sum games, in which convergence guarantees may be obtained. Outside of this class, however, learning
    is known to display complex behaviours, and convergence cannot always be guaranteed. Nonetheless, in order
    to develop a complete picture of the behaviour of multi-agent learning in competitive settings, the zero-sum
    assumption must be lifted.

    Motivated by this, we study the Q-Learning dynamics, a popular model of exploration and exploitation in multi-agent learning, in competitive network games. We determine how
    the degree of competition, the exploration rate and the network connectivity impact the convergence of Q-Learning. To
    study generic competitive games, we parameterise network games in terms of correlations between agent payoffs and
    study the average behaviour of the Q-Learning dynamics across all games drawn for a given choice of this parameter.
    This statistical approach establishes choices of parameters for which the Q-Learning dynamics converge to a stable
    fixed point. In contrast to previous works, we find that the stability of Q-Learning depends explicitly
    only on the network connectivity, rather than on the total number of agents. Our experiments validate these
    findings and show that, under certain network structures, the total number of agents can be increased without
    increasing the likelihood of unstable or chaotic behaviours.
\end{abstract}