\begin{abstract}
Existing automatic 3D image segmentation methods often fall short of the requirements of clinical use.
Many studies have explored interactive strategies that improve segmentation performance by iteratively incorporating user hints.
However, the dynamics of successive interactions are largely ignored.
Here we propose to model the dynamic process of iterative interactive image segmentation as a Markov decision process (MDP) and solve it with reinforcement learning (RL).
Unfortunately, using single-agent RL for voxel-wise prediction is intractable because of the large exploration space.
To reduce the exploration space to a tractable size, we treat each voxel as an agent with a shared voxel-level behavior strategy, so that the problem can be solved with multi-agent reinforcement learning.
An additional advantage of this multi-agent model is that it captures the dependencies among voxels for the segmentation task.
Meanwhile, to enrich the information carried over from previous segmentations, we retain the prediction uncertainty in the state space of the MDP and derive an adjustment action space, leading to a more precise and finer segmentation.
In addition, to improve the efficiency of exploration, we design a relative cross-entropy gain-based reward to update the policy in a constrained direction.
Experimental results on various medical datasets show that our method significantly outperforms existing state-of-the-art methods, requiring fewer interactions and converging faster.
\end{abstract}