abstract:9466ce18d951c4f3.tex

1: \begin{abstract}

2:  The ratio of two probability densities can be used for solving various machine learning tasks

3:  such as covariate shift adaptation (importance sampling),

4:  outlier detection (likelihood-ratio test), and feature selection

5:  (mutual information).

6:  Recently, several methods for directly estimating the density ratio

7:  have been developed, e.g., kernel mean matching,

8:  maximum likelihood density ratio estimation,

9:  and least-squares density ratio fitting.

10:  In this paper, we consider a kernelized variant of

11:  the least-squares method and investigate its theoretical properties

12:  from the viewpoint of the condition number using smoothed analysis

13:  techniques. The condition number

14:  of the Hessian matrix determines the convergence rate of optimization

15:  and the numerical stability.

16:  We show that the kernel least-squares method has a smaller condition number

17:  than a version of kernel mean matching and other M-estimators,

18:  implying that the kernel least-squares method has preferable numerical properties.

19:  We further give an alternative formulation of the kernel least-squares estimator

20:  which is shown to possess an even smaller condition number.

21:  Numerical experiments agree with our theoretical analysis.

25: \end{abstract}

26: