abstract:77bd7d836e8e9664.tex

1: \begin{abstract}

2:   \noindent

3:

4: We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier.  This allows us to find the asymptotically optimal vector of non-negative weights, which has a rather simple form.  We show that the ratio of the regret of this classifier to that of an unweighted $k$-nearest neighbour classifier depends asymptotically only on the dimension $d$ of the feature vectors, and not on the underlying populations.  The improvement is greatest when $d=4$, but thereafter decreases as $d \rightarrow \infty$.  The popular bagged nearest neighbour classifier can also be regarded as a weighted nearest neighbour classifier, and we show that its corresponding weights are somewhat suboptimal when $d$ is small (in particular, worse than those of the unweighted $k$-nearest neighbour classifier when $d=1$), but are close to optimal when $d$ is large.  Finally, we argue that improvements in the rate of convergence are possible under stronger smoothness assumptions, provided we allow negative weights.

5:

6:

7:   \bigskip

8:   \noindent {\bf Key words:} Bagging, classification, nearest neighbours, weighted nearest neighbour classifiers.

9:

10: \end{abstract}

11: