\begin{abstract}
Low-precision computation is often used to lower the time and energy cost of machine learning, and hardware accelerators have recently been developed to support it.
Still, it has been used primarily for inference, not training.
Previous low-precision training algorithms suffered from a fundamental tradeoff: as the number of bits of precision is lowered, quantization noise is added to the model, which limits statistical accuracy.
To address this issue, we describe a simple low-precision stochastic gradient descent variant called \sysname{}.
\sysname{} converges at the same theoretical rate as full-precision algorithms despite the noise introduced by using low precision throughout execution.
The key idea is to use stochastic variance-reduced gradient (SVRG) to reduce gradient variance, and to combine this with a novel technique called \emph{bit centering} to reduce quantization error.
We show that on the CPU, \sysname{} can run up to $\numb{4} \times$ faster than full-precision SVRG and can match its convergence trajectory.
We also implemented \sysname{} in TensorQuant and show that it exceeds the validation performance of plain low-precision SGD on two deep learning tasks.
\end{abstract}