\begin{abstract}
Reinforcement learning has recently gained unprecedented popularity, yet it still grapples with sample inefficiency. To address this challenge, federated reinforcement learning (FedRL) has emerged, in which agents collaboratively learn a single policy by aggregating their local estimates. However, this aggregation step incurs significant communication costs.
In this paper, we propose \compfedrl, a communication-efficient FedRL approach that incorporates both \emph{periodic aggregation} and \emph{(direct/error-feedback) compression} mechanisms. Specifically, we consider compressed federated $Q$-learning in a generative-model setting, where a central server learns an optimal $Q$-function by periodically aggregating compressed $Q$-estimates from local agents. We characterize, for the first time, the impact of these two mechanisms, which has so far remained elusive, by providing a finite-time analysis of our algorithm that demonstrates strong convergence behavior under either direct or error-feedback compression. Our bounds indicate improved solution accuracy with respect to the number of agents and other federated hyperparameters, together with reduced communication costs. To corroborate our theory, we conduct in-depth numerical experiments with the \textsf{Top-$K$} and \textsf{Sparsified-$K$} sparsification operators.

\keywords{Federated Reinforcement Learning \and Communication Efficiency \and Direct Compression \and Error-feedback Compression.} % \and Top-$K$ Sparsification \and Sparsified-$K$ Sparsification
\end{abstract}