
\begin{abstract}

For supervised classification problems, this paper considers estimating the query's label probability through local regression using observed covariates.

The well-known nonparametric kernel smoother and $k$-nearest neighbor ($k$-NN) estimator, which average the labels over a ball around the query, are consistent but asymptotically biased, particularly when the radius of the ball is large.

To eradicate such bias, local polynomial regression~(LPoR) and multiscale $k$-NN~(MS-$k$-NN) learn the bias term by local regression around the query and extrapolate it to the query itself.

However, their theoretical optimality has been shown only in the limit of an infinite number of training samples.

To correct the asymptotic bias with fewer observations, this paper proposes \emph{local radial regression~(LRR)} and its logistic regression variant, \emph{local radial logistic regression~(LRLR)}, which combine the advantages of LPoR and MS-$k$-NN. The idea is simple: we fit a local regression to the observed labels using only the radial distance from the query as the explanatory variable, and then extrapolate the estimated label probability to zero distance.
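
As a minimal illustration of this idea (the polynomial form, the degree $q$, the radius $h$, and the symbols $r_i$ and $\beta_j$ are assumptions made here for exposition, not notation fixed by the method), let $r_i = \|x_i - x_*\|$ denote the distance from the query $x_*$ to the $i$-th covariate $x_i$ with label $y_i$. A radial least-squares fit could then take the form
\[
(\hat{\beta}_0,\dots,\hat{\beta}_q) \in \mathop{\mathrm{arg\,min}}_{\beta_0,\dots,\beta_q} \sum_{i:\, r_i \le h} \Bigl( y_i - \sum_{j=0}^{q} \beta_j r_i^{j} \Bigr)^{2},
\qquad \hat{\eta}(x_*) = \hat{\beta}_0 ,
\]
so that the intercept $\hat{\beta}_0$ is the fitted curve evaluated at $r = 0$. Under the same assumptions, a logistic variant would pass the polynomial through the sigmoid $\sigma$ and report $\sigma(\hat{\beta}_0)$.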

We demonstrate the usefulness of the proposed method both theoretically and experimentally. We prove the convergence rate of the $L^2$ risk for LRR with reference to MS-$k$-NN, and our numerical experiments, which include real-world datasets of daily stock indices, show that LRLR outperforms LPoR and MS-$k$-NN.

\end{abstract}
