\begin{definition}[]
Given the expression for the bias in Theorem \ref{bias_theorem}, the ensemble estimation technique proposed in \cite{Kevin16} can be applied to improve the convergence rate of the MI estimator \eqref{est_def}. Assume that the densities in \textbf{A3} have continuous bounded derivatives up to order $q$, where $q\geq d$.
Let $\mathcal{T}:=\{t_1,\dots,t_T\}$ be a set of index values with $t_i<c$, where $c>0$ is a constant. Let $\epsilon(t):=tN^{-1/(2d)}$. For a given set of weights $w(t)$, the weighted ensemble estimator is then defined as
\begin{align}\label{EDGE_def}
\widehat{I}_w:=\sum_{t\in \mathcal{T}}w(t)\widehat{I}_{\epsilon(t)},
\end{align}
where $\widehat{I}_{\epsilon(t)}$ is the mutual information estimator with the parameter $\epsilon(t)$. Using \eqref{bias_terms}, for $q>0$ the bias of the weighted ensemble estimator \eqref{EDGE_def} takes the form
\begin{equation}
\mathbb{B}(\widehat{I}_w) = \sum_{i=1}^q C_i N^{-\frac{i}{2d}} \sum_{t\in \mathcal{T}} w(t) t^{i} +O\of{\frac{t^d}{N^{1/2}}}+O\of{\frac{1}{N\epsilon^d}}.
\label{Ensemble_bias}
\end{equation}
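
To sketch where the polynomial part of \eqref{Ensemble_bias} comes from, suppose (as an assumption about \eqref{bias_terms}, which is stated elsewhere) that the bias of a single estimator $\widehat{I}_{\epsilon}$ contains terms of the form $C_i\epsilon^i$. When the weights sum to one, the bias of \eqref{EDGE_def} is the weighted sum of the individual biases, so substituting $\epsilon(t)=tN^{-1/(2d)}$ gives
\[
\sum_{t\in\mathcal{T}} w(t)\sum_{i=1}^{q} C_i\,\epsilon(t)^{i}
=\sum_{i=1}^{q} C_i N^{-\frac{i}{2d}}\sum_{t\in\mathcal{T}} w(t)\,t^{i}.
\]
Each power of $N$ is therefore multiplied by a factor that depends only on the weights and the index values.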

Given the form \eqref{Ensemble_bias}, as long as $T\geq q$ we can select the weights $w(t)$ to force the slowly decaying terms in \eqref{Ensemble_bias} to zero, i.e., $\sum_{t\in \mathcal{T}} w(t)t^{i}=0$, subject to the constraint $\sum_{t\in \mathcal{T}} w(t)=1$. However, $T$ should be strictly greater than $q$ in order to control the variance, which is upper bounded by the squared Euclidean norm of the weights $w$. In particular, we have the following theorem (the proof is given in Appendix C):

\begin{theorem} \label{ensemble_theorem}
For $T>d$, let $w_0$ be the solution to
\begin{align}
\min_w &\qquad \|w\|_2 \nonumber\\
\text{subject to} &\qquad \sum_{t\in \mathcal{T}}w(t)=1, \nonumber\\
&\qquad \sum_{t\in \mathcal{T}}w(t)t^{i}=0, \quad i\in \mathbb{N},\ i\leq d.
\end{align}
Then the MSE of the ensemble estimator $\widehat{I}_{w_0}$ is $O(1/N)$.
\end{theorem}
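
As a worked illustration of the optimization in Theorem \ref{ensemble_theorem}, note that the constraints are linear in $w$, so the minimizer has a standard closed form. The notation $A$ and $e_0$ is introduced only for this sketch, and we assume that the $t_j$ are distinct and that $\mathbb{N}$ starts at $1$. Write $w=(w(t_1),\dots,w(t_T))^\top$, let $A\in\mathbb{R}^{(d+1)\times T}$ have entries $A_{ij}=t_j^{\,i}$ for $i=0,\dots,d$, and let $e_0=(1,0,\dots,0)^\top\in\mathbb{R}^{d+1}$. The constraints then read $Aw=e_0$. Since $A$ has full row rank when the $t_j$ are distinct and $T>d$, the minimum-norm solution is
\[
w_0=A^\top (AA^\top)^{-1}e_0, \qquad \|w_0\|_2^2=e_0^\top (AA^\top)^{-1}e_0 .
\]
For instance, with $d=1$ and $T=2$ the feasible set is a single point,
\[
w_0(t_1)=\frac{t_2}{t_2-t_1}, \qquad w_0(t_2)=-\frac{t_1}{t_2-t_1},
\]
and the hypothetical choice $t_1=1$, $t_2=2$ gives $w_0=(2,-1)^\top$ with $\|w_0\|_2^2=5$. Taking $T$ larger than $d+1$ enlarges the feasible set and can only decrease this norm.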

\end{definition}
26: