\begin{abstract}
Many practical problems can be modeled as multi-player games in which self-interested players each aim to optimize a local objective that also depends on the actions taken by the other players.
The local gradient information of each player, which is essential for implementing algorithms that find game solutions, is often unavailable.
In this paper, we focus on designing solution algorithms for multi-player games under bandit feedback, i.e., the only feedback available to each player is its realized objective value.
To address the large variance of existing bandit learning algorithms that use a single oracle call, we propose two algorithms that integrate the residual feedback scheme into single-call extra-gradient methods.
We then show that the actual sequences of play converge almost surely to a critical point if the game is pseudo-monotone plus, and we characterize the convergence rate to the critical point when the game is strongly pseudo-monotone.
As a supplement, we also investigate the ergodic convergence rates of the generated sequences in monotone games.
Finally, we verify the validity of the proposed algorithms through numerical examples.
\end{abstract}