\begin{abstract}
	Decentralized training has been actively studied in recent years. Although a wide variety of methods have been proposed, decentralized momentum SGD remains underexplored. In this paper, we propose a novel periodic decentralized momentum SGD method, which employs the momentum scheme and periodic communication for decentralized training. These two strategies, together with the topology of the decentralized training system, make the theoretical convergence analysis of our proposed method difficult. We address this challenging problem and provide the condition under which our proposed method achieves linear speedup with respect to the number of workers. Furthermore, we introduce a communication-efficient variant that reduces the communication cost in each communication round, and we also provide the condition under which this variant achieves linear speedup. To the best of our knowledge, both methods are the first to achieve these theoretical results in their respective domains. We conduct extensive experiments to verify the performance of the two proposed methods, and both show superior performance over existing methods.
\end{abstract}