\begin{abstract}
  Despite advances in scalable models, the inference tools used for Gaussian processes (GPs) have yet to fully capitalize on recent trends in machine learning hardware.
  In this paper, we present an efficient and general approach to GP inference based on Blackbox Matrix-Matrix multiplication (BBMM).
  BBMM inference uses a modified \emph{batched} version of the conjugate gradients algorithm to derive all terms required for training and inference in a single call.
  Adapting this algorithm to complex models requires only a routine for efficient matrix-matrix multiplication with the kernel and its derivative.
  In addition, BBMM uses a specialized preconditioner that substantially speeds up convergence.
  In experiments, we show that BBMM makes efficient use of GPU hardware, accelerating exact GP inference and many scalable approximations by up to $20\times$ compared with existing methods.
\end{abstract}