\begin{abstract}

  We study the minimax setting of binary classification with the $\text{F}$-score under a $\beta$-smoothness assumption on the regression function $\eta(x) = \Prob(Y = 1 | X = x)$ for $x \in \bbR^d$.

  We propose a classification procedure which, under the $\alpha$-margin assumption, achieves the rate $\bigO(n^{-(1 + \alpha)\beta / (2\beta + d)})$ for the excess $\text{F}$-score.

  In this context, the Bayes optimal classifier for the $\text{F}$-score can be obtained by thresholding the regression function $\eta$ at some level $\theta^*$, which has to be estimated.

  The proposed procedure is semi-supervised: we use a labeled dataset of size $n \in \bbN$ to estimate the regression function $\eta$ and an unlabeled dataset of size $N \in \bbN$ to estimate the optimal threshold $\theta^*$.

  Interestingly, the value of $N$ does not affect the rate of convergence, which indicates that it is ``harder'' to estimate the regression function $\eta$ than the optimal threshold $\theta^*$.

  This further implies that binary classification with the $\text{F}$-score behaves similarly to the standard setting of binary classification.

  Finally, we show that the rates achieved by the proposed procedure are optimal in the minimax sense up to a constant factor.

\end{abstract}