\begin{abstract}
We have developed a parallel algorithm for radial basis function (\rbf) interpolation that exhibits \bigON\ complexity,
requires \bigON\ storage, and scales excellently up to a thousand processes. The algorithm uses a \gmres iterative
solver, preconditioned with a restricted additive Schwarz method (\rasm), together with a fast matrix-vector
multiplication algorithm. Previous fast \rbf methods, which achieve at best $\mathcal{O}(N\log N)$ complexity, were
developed using multiquadric and polyharmonic basis functions. In contrast, the present method uses Gaussians with a
small variance, a common choice in particle methods for fluid simulation, which is our main target application. The
fast decay of the Gaussian basis function allows rapid convergence of the iterative solver even when the subdomains
in the \rasm are very small. The present method was implemented in parallel using the \petsc library (developer
version). Numerical experiments demonstrate its capability on \rbf interpolation problems with more than $50$
million data points, taking $106$ seconds ($19$ iterations for an error tolerance of $10^{-15}$) on $1024$ processors
of a Blue Gene/L (700 MHz PowerPC processors). The parallel code is freely available as open source.
\end{abstract}
