1: \begin{abstract}
2: 
3: Recently there has been significant interest in training 
4: machine-learning models at low precision: by reducing 
5: precision, one can reduce computation and communication by one order of magnitude. 
6: We examine training at reduced precision, from both a theoretical and a practical 
7: perspective, and ask: 
8: {\em is it possible to \emph{train} models at end-to-end low 
9: precision with \emph{provable} guarantees? Can this 
10: lead to consistent order-of-magnitude speedups?}
11: We present a framework called ZipML to answer these questions.
12: For linear models, the answer is yes. We develop a simple 
13: framework based on one novel strategy called double sampling. 
14: Our framework is able 
15: to execute training at low precision with no bias, 
16: guaranteeing convergence, whereas naive quantization 
17: would introduce significant bias. We validate our framework   
18: across a range of applications, and show that it enables an 
19: FPGA prototype that is up to $6.5\times$ faster 
20: than an implementation using full 32-bit precision.
21: We further develop a variance-optimal 
22: stochastic quantization
23: strategy and show that 
24: it can make a significant difference in a variety of settings. 
25: When applied to linear models together with 
26: double sampling, we save up to another 
27: $1.7\times$ in data movement compared with uniform quantization.
28: When
29: training deep networks with quantized models, 
30: we achieve higher accuracy than the state-of-the-art XNOR-Net. 
31: Finally, we extend our framework through approximation to non-linear 
32: models, such as SVMs. We show that, although using low-precision data induces bias, 
33: we can appropriately 
34: bound and control the bias. We find that, in practice, {\em 8-bit} 
35: precision is often sufficient to converge to the correct solution. 
36: Interestingly, however, we observe that our framework does not always outperform the naive rounding approach. We discuss this negative result in detail. 
37: 
38: 
39: \end{abstract}
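
To make the bias claim in the abstract concrete, the following is a minimal sketch for least-squares regression, not the paper's full derivation. It assumes a single sample $(a, b)$, a model $x$, and a stochastic quantizer $Q$ that is unbiased, i.e., $\mathrm{E}[Q(a)] = a$; the symbols $Q_1$ and $Q_2$ are introduced here for illustration only. The full-precision stochastic gradient is
\[
  g = a\,(a^\top x - b).
\]
Quantizing the sample once and reusing it in both positions yields a biased estimate, because the second moment of $Q(a)$ picks up its covariance:
\[
  \mathrm{E}\bigl[\,Q(a)\,(Q(a)^\top x - b)\,\bigr]
  = a\,(a^\top x - b) + \mathrm{Cov}[Q(a)]\,x .
\]
Drawing two independent quantizations $Q_1(a)$ and $Q_2(a)$ of the same sample removes the extra term, since the expectation of the product factorizes:
\[
  \mathrm{E}\bigl[\,Q_1(a)\,(Q_2(a)^\top x - b)\,\bigr]
  = \mathrm{E}[Q_1(a)]\,\bigl(\mathrm{E}[Q_2(a)]^\top x - b\bigr)
  = a\,(a^\top x - b) = g .
\]
Under these assumptions, the double-sampled estimate is unbiased at the cost of storing or transmitting a second quantized copy of each sample.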
40: