\begin{abstract}
The ratio of two probability densities can be used for solving various machine learning tasks
such as covariate shift adaptation (importance sampling),
outlier detection (likelihood-ratio test), and feature selection
(mutual information).
Recently, several methods for directly estimating the density ratio
have been developed, e.g., kernel mean matching,
maximum likelihood density ratio estimation,
and least-squares density ratio fitting.
In this paper, we consider a kernelized variant of
the least-squares method and investigate its theoretical properties
from the viewpoint of the condition number, using smoothed analysis
techniques. The condition number
of the Hessian matrix determines the convergence rate of optimization
and the numerical stability.
We show that the kernel least-squares method has a smaller condition number
than a version of kernel mean matching and other M-estimators,
implying that the kernel least-squares method has preferable numerical properties.
We further give an alternative formulation of the kernel least-squares estimator,
which is shown to possess an even smaller condition number.
Numerical experiments are consistent with our theoretical analysis.
\end{abstract}
