
\begin{abstract}

In this paper we address the problem of estimating the ratio $\frac{q}{p}$, where $p$ is a density function and $q$ is another density or, more generally, an arbitrary function. Knowing or approximating this ratio is needed in various problems of inference and integration, in particular when one needs to average a function with respect to one probability distribution, given a sample from another. This is often referred to as {\it importance sampling} in statistical inference and is also closely related to the problem of {\it covariate shift} in transfer learning, as well as to various MCMC methods.
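To make the averaging use concrete, here is a brief sketch in notation introduced only for illustration ($g$ is the function to be averaged and $x_1,\dots,x_n$ is a sample drawn from $p$). Assuming $q$ vanishes wherever $p$ does,
\[
\int g(x)\, q(x)\, dx \;=\; \int g(x)\, \frac{q(x)}{p(x)}\, p(x)\, dx \;\approx\; \frac{1}{n} \sum_{i=1}^{n} g(x_i)\, \frac{q(x_i)}{p(x_i)},
\]
so an estimate of the ratio at the sample points is enough to approximate the average with respect to $q$.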

Estimating this ratio may also be useful for separating the underlying geometry of a space, say a manifold, from the density function defined on it.


Our approach reformulates the problem of estimating $\frac{q}{p}$ as an inverse problem in terms of an integral operator corresponding to a kernel, thus reducing it to an integral equation known as the Fredholm problem of the first kind. This formulation, combined with the techniques of regularization and kernel methods, leads to a principled kernel-based framework for constructing algorithms and for analyzing them theoretically.
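As a rough sketch of this reformulation, in notation assumed here for illustration only (a kernel $k$, an operator $\mathcal{K}_p$, a regularization parameter $\lambda > 0$, and a function space $\mathcal{H}$ with norm $\|\cdot\|_{\mathcal{H}}$): writing $(\mathcal{K}_p f)(x) = \int k(x,y)\, f(y)\, p(y)\, dy$, the ratio $f = \frac{q}{p}$ satisfies
\[
(\mathcal{K}_p f)(x) \;=\; \int k(x,y)\, q(y)\, dy \;=:\; h(x),
\]
which is an integral equation of the first kind in the unknown $f$. One regularized estimate of Tikhonov type could then take the form
\[
\arg\min_{f \in \mathcal{H}} \; \| \mathcal{K}_p f - h \|^2 + \lambda\, \| f \|_{\mathcal{H}}^2 ,
\]
where the first norm is left unspecified in this sketch and the integrals are replaced by sample averages in practice.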


The resulting family of algorithms (FIRE, for Fredholm Inverse Regularized Estimator) is flexible, simple, and easy to implement.


We provide a detailed theoretical analysis, including concentration bounds and convergence rates for the Gaussian kernel in the case of densities defined on $\R^d$, compact domains in $\R^d$, and smooth $d$-dimensional sub-manifolds of the Euclidean space.


We present experimental results, including applications to classification and semi-supervised learning within the covariate shift framework, and report some encouraging experimental comparisons. We also show how the parameters of our algorithms can be chosen in a completely unsupervised manner.

\end{abstract}