1: \begin{abstract}
2: Supervised learning requires a large amount of labeled data,
3: which can be a major bottleneck when privacy is a concern or the labeling cost is high.
4: To overcome this problem, we propose a new weakly-supervised learning setting, called \emph{SU classification},
5: in which only \emph{similar (S)} data pairs (two examples that belong to the same class) and \emph{unlabeled (U)} data points are needed
6: instead of fully labeled data.
7: % SU classification is useful in various applications, such as speaker identification and protein function prediction.
8: We show that an unbiased estimator of the classification risk can be obtained from SU data alone,
9: and that the estimation error of its empirical risk minimizer achieves the optimal parametric convergence rate.
10: Finally, we demonstrate the effectiveness of the proposed method through experiments.
11: \end{abstract}
12: