\begin{abstract}
Supervised learning requires a large amount of labeled data,
which can be a major bottleneck when privacy is a concern or labeling costs are high.
To overcome this problem, we propose a new weakly-supervised learning setting, called \emph{SU classification},
in which only \emph{similar (S)} data pairs (two examples belonging to the same class) and \emph{unlabeled (U)} data points are needed
instead of fully labeled data.
% SU classification is useful in various applications, such as speaker identification and protein function prediction.
We show that an unbiased estimator of the classification risk can be obtained from SU data alone,
and that the estimation error of its empirical risk minimizer achieves the optimal parametric convergence rate.
Finally, we demonstrate the effectiveness of the proposed method through experiments.
\end{abstract}