\begin{abstract}
 The ratio of two probability densities can be used for solving various machine learning tasks
 such as covariate shift adaptation (importance sampling),
 outlier detection (likelihood-ratio test), and feature selection
 (mutual information).
 Recently, several methods of directly estimating the density ratio
 have been developed, e.g., kernel mean matching,
 maximum likelihood density ratio estimation,
 and least-squares density ratio fitting.
 In this paper, we consider a kernelized variant of
 the least-squares method and investigate its theoretical properties
 from the viewpoint of the condition number using smoothed analysis
 techniques. The condition number
 of the Hessian matrix determines the convergence rate of optimization
 and the numerical stability.
 We show that the kernel least-squares method has a smaller condition number
 than a version of kernel mean matching and other M-estimators,
 implying that the kernel least-squares method has preferable numerical properties.
 We further give an alternative formulation of the kernel least-squares estimator
 which is shown to possess an even smaller condition number.
 Numerical experiments are consistent with our theoretical analysis.
%\keywords{First keyword \and Second keyword \and More}
% \PACS{PACS code1 \and PACS code2 \and more}
% \subclass{MSC code1 \and MSC code2 \and more}
\end{abstract}