51bfb2a1aaac3ade.tex
1: \begin{abstract}
2: 	In this paper, we propose a passivity-based methodology for the analysis and design of reinforcement learning in multi-agent finite games. Starting from a known exponentially-discounted reinforcement learning scheme, we show that it converges to a Nash distribution in the class of games characterized by the monotonicity property of their (negative) payoff. We further exploit passivity to propose a class of higher-order schemes that preserve these convergence properties, can improve the speed of convergence, and can even converge in cases where their first-order counterparts fail to converge. We demonstrate these properties through numerical simulations for several representative games.
3: \end{abstract}
4: