c281d47580491103.tex
1: \begin{abstract} 
2: In this paper we address the problem of estimating the ratio $\frac{q}{p}$, where $p$ is a density function and $q$ is another density or, more generally, an arbitrary function. Knowing or approximating this ratio is needed in various problems of inference and integration, in particular when one needs to average a function with respect to one probability distribution given a sample from another. This setting is often referred to as {\it importance sampling} in statistical inference and is also closely related to the problem of {\it covariate shift} in transfer learning, as well as to various MCMC methods.
3: It may also be useful for separating the underlying geometry of a space, say a manifold, from the density function defined on it. 
4: 
5: Our approach is based on reformulating the problem of estimating $\frac{q}{p}$ as an inverse problem in terms of an integral operator corresponding to a kernel, thus reducing it to an integral equation known as the Fredholm problem of the first kind. This formulation, combined with the techniques of regularization and kernel methods, leads to a principled kernel-based framework for constructing algorithms and for analyzing them theoretically.
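As a minimal sketch of this reformulation, assume a kernel $k(x,y)$ and write $\mathcal{K}_p f(x) = \int k(x,y) f(y) p(y)\,dy$ for the associated integral operator (this notation is introduced here only for illustration). Wherever $p > 0$,
\[
\mathcal{K}_p \frac{q}{p}(x) = \int k(x,y)\, \frac{q(y)}{p(y)}\, p(y)\,dy = \int k(x,y)\, q(y)\,dy ,
\]
so $\frac{q}{p}$ solves an integral equation whose right-hand side depends only on $q$ and $k$.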
6: 
7: The resulting family of algorithms (FIRE, for Fredholm Inverse Regularized Estimator) is flexible,  simple and  easy to implement. 
8: 
9: We provide a detailed theoretical analysis, including concentration bounds and convergence rates for the Gaussian kernel, in the case of densities defined on $\R^d$, on compact domains in $\R^d$, and on smooth $d$-dimensional sub-manifolds of Euclidean space.
10: 
11: We also present experimental results, including applications to classification and semi-supervised learning within the covariate shift framework, and report some encouraging experimental comparisons. Finally, we show how the parameters of our algorithms can be chosen in a completely unsupervised manner.
12: \end{abstract}