\begin{abstract}
We derive the mean squared error convergence rates of kernel density-based
plug-in estimators of mutual information measures between two multidimensional
random variables $\mathbf{X}$ and $\mathbf{Y}$ for two cases: 1)
$\X$ and $\Y$ are both continuous; 2) $\X$ is continuous and $\Y$
is discrete. Using the derived rates, we propose an ensemble estimator
of these information measures for the second case by taking a weighted
sum of the plug-in estimators with varied bandwidths. The resulting
ensemble estimator achieves the $1/N$ parametric convergence rate
when the conditional densities of the continuous variables are sufficiently
smooth. To the best of our knowledge, this is the first nonparametric
mutual information estimator known to achieve the parametric convergence
rate for this case, which frequently arises in applications (e.g.,
variable selection in classification). The estimator is simple to
implement, as it requires only the solution to an offline convex optimization
problem and simple plug-in estimators. A central limit theorem is
also derived for the ensemble estimator. Ensemble estimators that
achieve the parametric rate are also derived for the first case ($\X$
and $\Y$ are both continuous) and for a third case: 3) $\X$ and $\Y$
may have any mixture of discrete and continuous components.
\end{abstract}