\begin{abstract}
Gaussian processes (GPs) provide flexible distributions over functions, with inductive biases controlled by a kernel.
However, in many applications GPs can struggle with even moderate input dimensionality.
Learning a low-dimensional projection can help alleviate this curse of dimensionality, but it introduces many trainable hyperparameters, which can be cumbersome, especially in the small-data regime.
We use sums of kernels for GP regression, where each kernel operates on a different random projection of its inputs.
Surprisingly, we find that as the number of random projections increases, the predictive performance of this approach quickly converges to that of a kernel operating on the original full-dimensional inputs, over a wide range of data sets, \emph{even if we are projecting into a single dimension}.
As a consequence, many problems can be reduced to one-dimensional input spaces without learning a transformation.
We prove this convergence and its rate, and additionally propose a deterministic approach that converges more quickly than purely random projections.
Moreover, we demonstrate that our approach can achieve faster inference and improved predictive accuracy for high-dimensional inputs compared to kernels in the original input space.

\end{abstract}
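
% Illustrative sketch only: the symbols J, d, D, P^{(j)}, k_j and \alpha_j are
% assumed notation introduced here, not fixed by the abstract.
As a minimal sketch of the construction summarized in the abstract, and assuming notation that the abstract itself does not fix, suppose inputs $x, x' \in \mathbb{R}^{D}$, a number $J$ of random projection matrices $P^{(j)} \in \mathbb{R}^{d \times D}$ with $d \ll D$, base kernels $k_j$ on $\mathbb{R}^{d}$, and nonnegative weights $\alpha_j$. A sum of kernels acting on different random projections could then be written as
\begin{equation*}
  k_{\mathrm{add}}(x, x') \;=\; \sum_{j=1}^{J} \alpha_j \, k_j\!\left(P^{(j)} x,\; P^{(j)} x'\right).
\end{equation*}
Under these assumptions, the single-dimension case emphasized in the abstract corresponds to $d = 1$, where each $P^{(j)}$ is a single row vector and each $k_j$ compares two scalars. The weighting, scaling, and distribution of the $P^{(j)}$ are left unspecified in this sketch.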