\begin{abstract}
We introduce a framework -- \Artemis~-- to tackle the problem of learning in a distributed or federated setting with communication constraints and partial device participation.
Several randomly sampled workers perform the optimization process, using a central server to aggregate their computations. To alleviate the communication cost, \Artemis~compresses the information sent in \emph{both directions} (from the workers to the server and conversely) and combines this compression with a memory mechanism.
It improves on existing algorithms that only consider unidirectional compression (to the server), or use very strong assumptions on the compression operator, and often do not take into account partial device participation. We provide fast rates of convergence (linear up to a threshold) under weak assumptions on the stochastic gradients (noise variance bounded only \textit{at the optimal point}) in a non-i.i.d.~setting, highlight the impact of memory for unidirectional and bidirectional compression, and analyze Polyak-Ruppert averaging. We use convergence in distribution to obtain a \textit{lower bound} on the asymptotic variance that highlights practical limits of compression. Finally, we provide experimental results to demonstrate the validity of our analysis.
\end{abstract}