\begin{abstract}
In this paper we revisit the classical method of partitioning classification and study its convergence rate under relaxed conditions, both for observable (non-privatised)
and for privatised data.
Let the feature vector $X$ take values in $\R^d$ and denote its label by $Y$.
Previous results on the partitioning classifier relied on the strong density assumption, which is restrictive, as we demonstrate through simple examples.
Here, we study the problem under much milder assumptions: we assume that the distribution of $X$ is a mixture of an absolutely continuous and a discrete distribution, such that the absolutely continuous component is concentrated on a $d_a$-dimensional subspace.
In addition to the standard Lipschitz and margin conditions, we introduce a novel characteristic of the absolutely continuous component, by which the exact convergence rate of the classification error probability is calculated, both for the binary and for the multi-label cases.
Interestingly, this rate of convergence depends only on the intrinsic dimension $d_a$.

The privacy constraints mean that the data $(X_1,Y_1), \dots, (X_n,Y_n)$ cannot be directly observed, and the classifiers are functions of the randomised outcome of a suitable local differential privacy mechanism.
The statistician is free to choose the form of this privacy mechanism; here we add Laplace-distributed noise to the discretisations of all possible locations of the feature vector $X_i$ and to its label $Y_i$.
Again, tight upper bounds on the rate of convergence of the classification error probability are derived without the strong density assumption, and this rate depends on $2\,d_a$.
\end{abstract}