\begin{abstract}%   <- trailing '%' for backward compatibility of .sty file
In this work we consider the problem of multi-label classification, where each instance is associated with a binary vector.
Our goal is to find a classifier that minimizes false negative discoveries under constraints.
Depending on the considered set of constraints, we propose plug-in methods and provide a non-asymptotic analysis under margin-type assumptions.
Specifically, we analyze two examples of constraints that promote sparse predictions: in the first, we focus on classifiers with $\ell_0$-type constraints, and in the second, we address classifiers with bounded false positive discoveries.
The two formulations lead to different Bayes rules and, thus, to different plug-in approaches.
% Recent empirical studies have shown that the plug-in approach performs particularly well in several contexts of multi-label classification.
% However, the theoretical study of these methods is usually limited to consistency results.
% In contrast, we provide a non-asymptotic analysis for such methods by establishing excess risk upper bounds.
The first scenario is the popular multi-label top-$K$ procedure: a label is predicted to be relevant if its score is among the $K$ largest.
For this case, we provide an excess risk bound that achieves so-called ``fast'' rates of convergence under a generalization of the margin assumption to this setting.
The second scenario differs significantly from the top-$K$ setting, as the constraints are distribution-dependent.
We demonstrate that in this scenario almost sure control of false positive discoveries is impossible without extra assumptions.
To alleviate this issue, we propose a sufficient condition for consistent estimation and provide a non-asymptotic upper bound.
\end{abstract}