abstract:010ce9fd8b822d2a.tex

1: \begin{abstract}

2: Communication overhead has become a significant bottleneck in data-parallel networks as model sizes and the number of data samples grow.

3: %

4: In this work, we propose a new algorithm, LPC-SVRG, with quantized gradients

5: %

6: and its accelerated variant, ALPC-SVRG, to effectively reduce the communication complexity while maintaining the same convergence as the unquantized algorithms.

7: %

8: Specifically, we incorporate the heuristic gradient clipping technique into the quantization scheme and

9: %

10: show that the unbiased quantization methods in related work \cite{alistarh2017qsgd,wen2017terngrad,zhang2017zipml} are special cases of ours.

11: %

12: We introduce \emph{double sampling} in the accelerated algorithm ALPC-SVRG to fully combine full-precision and low-precision gradients, and thereby achieve acceleration with less communication overhead.

13: %

14: Our analysis covers the nonsmooth composite problem, which makes our algorithms more broadly applicable.

15: %

16: Experiments on linear models and deep neural networks validate the effectiveness of our algorithms.

17: \end{abstract}

18: