
\begin{abstract}

Fully quantized training (FQT), which uses low-bitwidth hardware by quantizing the activations, weights, and gradients of a neural network model, is a promising approach to accelerate the training of deep neural networks.

One major challenge with FQT is the lack of theoretical understanding, in particular of how gradient quantization impacts convergence properties.

In this paper, we address this problem by presenting a statistical framework for analyzing FQT algorithms.

We view the quantized gradient of FQT as a stochastic estimator of its full-precision counterpart, the gradient computed in quantization-aware training (QAT).

We show that the FQT gradient is an unbiased estimator of the QAT gradient, and we discuss the impact of gradient quantization on its variance.

Inspired by these theoretical results, we develop two novel gradient quantizers and show that they have smaller variance than the existing per-tensor quantizer.

For training ResNet-50 on ImageNet, our 5-bit block Householder quantizer achieves only 0.5\% validation accuracy loss relative to QAT, comparable to the existing INT8 baseline. Our code is publicly available at \url{https://github.com/cjf00000/StatQuant}.

\end{abstract}
