1: \begin{abstract}
2: In this paper, a scheme for communication-efficient Deep Neural Network (DNN) training is proposed.
3: %
4: In particular, we consider a distributed learning scheme that encompasses three components crucial to implementing such a scheme over wireless networks: (i) gradient compression through floating-point conversion, (ii) lossless gradient compression, and (iii) error feedback.
5: %
6: The interplay of these three components is carefully balanced to yield a robust and high-performing scheme for the transmission of DNN gradients over a rate-limited network.
7: %
8: Specifically, we consider a memory decay coefficient that controls the memory accumulation in the error feedback mechanism and argue that this coefficient, similarly to the learning rate, can be tuned to improve convergence.
9: %
10: The proposed scheme is shown to achieve better accuracy and improved stability, despite the reduced payload needed for gradient transmission.
11: %
12: The performance of the proposed scheme is investigated through numerical evaluations.
13: \end{abstract}
14: