\begin{abstract}
This paper proposes the Doubly Compressed Momentum-assisted Stochastic Gradient Tracking algorithm (\aname) for communication-efficient decentralized learning. \aname~uses two compression steps per communication round because the algorithm simultaneously tracks the averaged iterate and the stochastic gradient. Furthermore, \aname~incorporates a momentum-based technique to reduce the variance of the gradient estimates. We show that \aname~finds a solution $\avgtheta$ in $T$ iterations satisfying $\mathbb{E} [ \norm{ \nabla f(\avgtheta) }^2 ] = {\cal O}( 1 / T^{2/3} )$ for non-convex objective functions, and we provide competitive convergence rate guarantees for other function classes. Numerical experiments on synthetic and real datasets validate the efficacy of our algorithm.
\end{abstract}