1: \begin{abstract}
2: We study no-regret learning algorithms for general monotone and smooth games and their last-iterate convergence properties. Specifically, we consider bandit feedback and strongly uncoupled dynamics, a setting that allows modular development of multi-player systems and applies to a wide range of real applications. We propose a mirror-descent-based algorithm that is no-regret and whose last iterate converges at a rate of $O(T^{-1/4})$. The result is achieved through a careful use of two regularizations and an analysis of the fixed point thereof.
3: The convergence rate is further improved to $O(T^{-1/2})$ in the case of strongly monotone games.
4: Motivated by practical tasks in which the game evolves over time, we extend the algorithm to time-varying monotone games. We provide the first non-asymptotic result for converging monotone games and give improved results for equilibrium-tracking games.
5: 
6: %games, our algorithm improves the results for . We further verify the effectiveness of our algorithm with empirical evaluations. 
7: 
8: %To our best knowledge, this is the first uncoupled and convergent algorithm in general monotone games under bandit feedback. 
9: %We then extend our results to 
10: 
11: %same alg improved results
12: \end{abstract}
13: