\begin{abstract}
Recently there has been significant interest in training
machine-learning models at low precision: by reducing
precision, one can reduce computation and communication by one order of magnitude.
We examine training at reduced precision, from both a theoretical and a practical
perspective, and ask:
{\em is it possible to \emph{train} models at end-to-end low
precision with \emph{provable} guarantees? Can this
lead to consistent order-of-magnitude speedups?}
We present a framework called ZipML to answer these questions.
For linear models, the answer is yes. We develop a
framework based on a simple but novel strategy called double sampling.
Our framework can
execute training at low precision with no bias,
guaranteeing convergence, whereas naive quantization
would introduce significant bias. We validate our framework
across a range of applications and show that it enables an
FPGA prototype that is up to $6.5\times$ faster
than an implementation using full 32-bit precision.
We further develop a variance-optimal stochastic quantization
strategy and show that
it can make a significant difference in a variety of settings.
When applied to linear models together with
double sampling, it saves up to another
$1.7\times$ in data movement compared with uniform quantization.
When training deep networks with quantized models,
we achieve higher accuracy than the state-of-the-art XNOR-Net.
Finally, we extend our framework through approximation to non-linear
models, such as SVM. We show that, although using low-precision data induces bias,
we can appropriately
bound and control the bias. We find that, in practice, {\em 8-bit}
precision is often sufficient to converge to the correct solution.
Interestingly, however, we notice that our framework does not always outperform the naive rounding approach in practice. We discuss this negative result in detail.
\end{abstract}
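
As a minimal sketch of why naive quantization is biased and how double sampling can remove the bias, consider the following setting, which is an assumption made for illustration and is not fixed by the abstract: a least-squares loss $\frac{1}{2}(a^\top x - b)^2$ for a sample $(a, b)$ and model $x$, and a stochastic quantizer $Q$ that is unbiased, $\mathrm{E}[Q(a)] = a$ (for instance, one that rounds each coordinate to one of its two neighboring grid points with probability proportional to proximity). The symbols $a$, $b$, $x$, $Q$, $Q_1$, and $Q_2$ are hypothetical notation. The full-precision stochastic gradient is $g = a(a^\top x - b)$. Reusing a single quantized sample in both places gives
\[
\mathrm{E}\bigl[Q(a)\bigl(Q(a)^\top x - b\bigr)\bigr]
  = a(a^\top x - b) + \mathrm{Cov}\bigl(Q(a)\bigr)\,x ,
\]
so the estimate carries a bias of $\mathrm{Cov}(Q(a))\,x$. Drawing two independent quantizations $Q_1(a)$ and $Q_2(a)$ of the same sample instead gives
\[
\mathrm{E}\bigl[Q_1(a)\bigl(Q_2(a)^\top x - b\bigr)\bigr]
  = \mathrm{E}[Q_1(a)]\bigl(\mathrm{E}[Q_2(a)]^\top x - b\bigr)
  = a(a^\top x - b) ,
\]
which is unbiased under these assumptions, because independence lets the expectation factor across the two quantized copies.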
