\begin{abstract}
%which allows joint training over distributed data sets and computing resources without any disclosure of private data at edge devices,
%Federated Learning (FL) has become a new paradigm for fast intelligence acquisition at the network edge. Communication is a critical enabler for scaling up these benefits, owing to the significant amount of model information that must be exchanged among edge devices. In this paper, we consider a network of wireless devices sharing a common radio interface for the purpose of collaboratively training a machine learning model. Each device holds a generally distinct training set, and wireless communication typically takes place over a connectivity graph in a Device-to-Device (D2D) manner. In the ideal case in which all devices within communication range can communicate simultaneously and noiselessly, Decentralized Stochastic Gradient Descent (DSGD) is a standard protocol that is guaranteed to converge to an optimal solution of the global empirical risk minimization problem under assumptions of convexity and connectivity.
%DSGD integrates local SGD steps with periodic consensus-averaging updates that require communication among neighboring devices.
%To apply DSGD in the presence of path loss, blockages, fading, and mutual interference, device scheduling policies and physical-layer transmission schemes are revisited. Specifically, the proposed scheduling policies aim to avoid mutual interference in the digital implementation and to reap the benefits of {\em over-the-air computing} in the analog implementation. Physical-layer transmission schemes that account for compression are also proposed for the digital and analog implementations, respectively, with the latter leveraging sparsity-based signal recovery.
%Both implementations are evaluated against various benchmarks through simulations, which reveal the scenarios in which the digital or the analog implementation is preferred.
%\end{abstract}
