
\begin{abstract}

In recent years, machine learning methods that nearly interpolate the data have achieved remarkable success. In many settings, achieving near-zero training error leads to excellent test results. In this work we show how the mathematical and conceptual simplicity of interpolation can be harnessed to construct a framework for very efficient, scalable, and accurate kernel machines.


Our main innovation is in constructing kernel machines that output solutions mathematically equivalent to those obtained using standard kernels, yet are capable of fully utilizing the available computing power of a parallel computational resource, such as a GPU. Such utilization is key to strong performance, since standard iterative methods waste much of the resource's capability. The adaptivity of our learned kernels to the computational resource and to the data is based on theoretical convergence bounds.


The resulting algorithm, which we call \textit{EigenPro 2.0}, is accurate, principled, and very fast. For example, using a single GPU, training on ImageNet with $1.3\times 10^6$ data points and $1000$ labels takes under an hour, while smaller datasets, such as MNIST, take seconds.

Moreover, since the parameters are chosen analytically from the theory, little tuning is needed beyond selecting the kernel and the kernel parameter, which further facilitates the practical use of these methods.

\end{abstract}
