\begin{abstract}
	In large-scale federated and decentralized learning, communication efficiency is one of the most challenging bottlenecks. While gossip communication, in which agents exchange information with their connected neighbors, is more cost-effective than communicating with a remote server, it often requires more communication rounds, especially over large and sparse networks. To tackle this trade-off, we examine communication efficiency under a \textit{semi-decentralized} communication protocol, in which agents can perform both agent-to-agent and agent-to-server communication \textit{in a probabilistic manner}. We design a tailored communication-efficient algorithm over semi-decentralized networks, referred to as \alg, which inherits robustness to data heterogeneity from gradient tracking and allows multiple local updates to save communication. We establish the convergence rate of \alg~for nonconvex problems and show that \alg~enjoys a linear speedup in terms of the number of agents and local updates. Our numerical results highlight the superior communication efficiency of \alg~and its resilience to data heterogeneity and to various network topologies.
\end{abstract}