\begin{abstract}
This work proposes a policy learning algorithm for generalised feedback Nash equilibrium seeking in $N_P$-player non-cooperative dynamic games. We consider linear-quadratic games with stochastic dynamics and design best-response dynamics in which players update and communicate a parametrisation of their state-feedback policies. Our approach leverages the System Level Synthesis (SLS) framework to formulate each player's update rule as the solution of a tractable robust optimisation problem. Under suitable assumptions, we establish conditions for convergence of the best-response dynamics, together with the associated convergence rates. The algorithm is demonstrated on an example problem from decentralised control of multi-agent systems.
\end{abstract}