1: \begin{abstract} % Abstract of not more than 200 words.
2:
3: %GP
4: Gaussian Processes (GPs) are powerful kernelized methods for non-parametric regression used in many applications.
5: However, their use is limited to a few thousand training samples because their time complexity is cubic in the number of samples.
6: %sparse GP
7: In order to scale GPs to larger datasets, several sparse approximations
8: based on so-called inducing points have been proposed in the literature.
9: In this work, we investigate the connection between a general class of sparse inducing-point GP regression methods and Bayesian recursive estimation, which enables Kalman-filter-like updating for online learning. %Moreover, exploiting ideas from distributed estimation, we show how our approach can be distributed.
10: %
11: The majority of previous work has focused on the batch setting, in particular for learning the model parameters and the positions of the inducing points;
12: here, we instead focus on training with mini-batches.
13: By exploiting the Kalman filter formulation, we propose a novel approach that estimates these parameters by recursively propagating
14: the analytical gradients of the posterior over mini-batches of the data.
15: Compared to state-of-the-art methods, our method retains analytic updates for the mean and covariance of the posterior, thus drastically reducing the size of the optimization problem.
16: % results
17: We show that our method achieves faster convergence and superior performance compared to state-of-the-art sequential GP regression methods on synthetic GP data as well as on real-world data with up to a million samples.
18:
19: \end{abstract}
20: