\begin{abstract}

Gaussian processes (GPs) provide flexible distributions over functions, with inductive biases controlled by a kernel. However, in many applications GPs can struggle with even moderate input dimensionality. Learning a low-dimensional projection can help alleviate this curse of dimensionality, but introduces many trainable hyperparameters, which can be cumbersome, especially in the small-data regime. We use sums of kernels for GP regression, where each kernel operates on a different random projection of the inputs. Surprisingly, we find that as the number of random projections increases, the predictive performance of this approach quickly converges to that of a kernel operating on the original full-dimensional inputs, over a wide range of data sets, \emph{even if we are projecting into a single dimension}. As a consequence, many problems can be reduced to one-dimensional input spaces without learning a transformation. We prove this convergence and its rate, and additionally propose a deterministic approach that converges more quickly than purely random projections. Moreover, we demonstrate that our approach can achieve faster inference and improved predictive accuracy for high-dimensional inputs compared to kernels in the original input space.

\end{abstract}