\begin{abstract}
Segmentation has emerged as a fundamental task in computer vision and natural language processing: it assigns a label to every pixel/feature in order to extract regions of interest from an image/text. Segmentation performance is commonly evaluated with the Dice and IoU metrics, which measure the degree of overlap between the ground truth and the predicted segmentation.
In this paper, we establish a theoretical foundation of segmentation with respect to the Dice/IoU metrics, including the Bayes rule and Dice-/IoU-calibration, analogous to classification-calibration or Fisher consistency in classification.
We prove that the existing thresholding-based framework, with most operating losses, is not consistent with respect to the Dice/IoU metrics and thus may lead to a suboptimal solution.
To address this pitfall, we propose a novel consistent ranking-based framework, namely \textit{RankDice}/\textit{RankIoU}, inspired by plug-in rules of the Bayes segmentation rule.
Three numerical algorithms with GPU parallel execution are developed to implement the proposed framework in large-scale and high-dimensional segmentation.
We also study the statistical properties of the proposed framework: we show that it is Dice-/IoU-calibrated, and we provide its excess risk bounds and rate of convergence.
The numerical effectiveness of \textit{RankDice/mRankDice} is demonstrated on various simulated examples and on the \textit{Fine-annotated CityScapes}, \textit{Pascal VOC} and \textit{Kvasir-SEG} datasets with state-of-the-art deep learning architectures.
\end{abstract}
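
As a point of reference, the two metrics can be sketched under the standard set-based definitions. This is an assumption made here for illustration: the notation $Y$, $\widehat{Y}$ is hypothetical, and the formal definitions used in the paper may differ, for example by including a smoothing term. For a ground-truth foreground set $Y$ and a predicted foreground set $\widehat{Y}$, not both empty,
\[
\mathrm{Dice}(Y,\widehat{Y}) = \frac{2\,|Y\cap\widehat{Y}|}{|Y|+|\widehat{Y}|},
\qquad
\mathrm{IoU}(Y,\widehat{Y}) = \frac{|Y\cap\widehat{Y}|}{|Y\cup\widehat{Y}|}.
\]
Since $|Y\cup\widehat{Y}| = |Y|+|\widehat{Y}|-|Y\cap\widehat{Y}|$, the two quantities are monotonically related on a single sample by $\mathrm{Dice} = 2\,\mathrm{IoU}/(1+\mathrm{IoU})$.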