\begin{abstract}
Multi-agent learning in competitive network games is often studied within the context of
zero-sum games, in which convergence guarantees may be obtained. Outside of this class, however,
learning is known to display complex behaviours and convergence cannot always be guaranteed. Nonetheless, in order
to develop a complete picture of multi-agent learning in competitive settings, the zero-sum
assumption must be lifted.

Motivated by this, we study the Q-Learning dynamics, a popular model of exploration and exploitation in multi-agent learning, in competitive network games. We determine how
the degree of competition, exploration rate and network connectivity impact the convergence of Q-Learning. To
study generic competitive games, we parameterise network games in terms of correlations between agent payoffs and
study the average behaviour of the Q-Learning dynamics across all games drawn for a given choice of this parameter.
This statistical approach establishes choices of parameters for which Q-Learning dynamics converge to a stable
fixed point. In contrast to previous works, we find that the stability of Q-Learning depends explicitly
only on the network connectivity rather than the total number of agents. Our experiments validate these
findings and show that, under certain network structures, the total number of agents can be increased without
increasing the likelihood of unstable or chaotic behaviours.
\end{abstract}