\begin{abstract}
Estimating the $\phi$-divergence between two unknown probability distributions from empirical data is a fundamental problem in
information theory and statistical learning. We consider a multivariate generalization of the data-dependent partitioning method
for estimating the divergence between two unknown distributions.
Under the assumption that the distribution satisfies a power law of decay, we provide a convergence rate result for this method, bounding the
number of samples and hyper-rectangles required to ensure that the estimation error is bounded by a given level with a given probability.
\end{abstract}
