1: \begin{abstract}
2: Network Markov Decision Processes (MDPs), a popular model for multi-agent control, pose a significant challenge to efficient learning due to the exponential growth of the global state-action space with the number of agents. 
3: In this work, utilizing the exponential decay property of network dynamics, we first derive scalable spectral local representations for network MDPs, 
4: % which have enabled efficient learning for single-agent MDPs, 
5: which induce a network linear subspace for the local $Q$-function of each agent. 
6: Building on these spectral local representations, we design a scalable algorithmic framework for continuous state-action network MDPs and provide end-to-end convergence guarantees for our algorithm. Empirically, we validate the effectiveness of our scalable representation-based approach on two benchmark problems and demonstrate its advantages over generic function approximation approaches to representing the local $Q$-functions.
7: 
8: % \lina{We can either make a short abstract by not talking too much of the background/literature or if we talk, we should talk about the work which did function approximation--using a NN to approximate Q for continuous state/action.} \zhaolin{I made it shorter.}
9: \end{abstract}
10: