e9daf9b0740fde6b.tex
1: \begin{abstract}
2: 
3: Virtual try-on of eyeglasses involves placing eyeglasses of different shapes and styles onto a face image without physically trying them on. 
4: While existing methods have shown impressive results, they support only a limited variety of eyeglasses styles, and their interactions are not always intuitive or efficient.
5: To address these limitations, we propose a Text-guided Eyeglasses Manipulation method that allows for control of the eyeglasses shape and style based on a binary mask and text, respectively. 
6: Specifically, we introduce a mask encoder to extract mask conditions and a modulation module that enables simultaneous injection of text and mask conditions. 
7: This design allows for fine-grained control of the eyeglasses’ appearance based on both textual descriptions and spatial constraints.
8: Our approach also includes a disentangled mapper and a decoupling strategy to preserve irrelevant areas, resulting in better local editing.
9: We employ a two-stage training scheme to handle the different convergence speeds of the text and mask conditions, which allows the method to control both the shape and style of eyeglasses.
10: Extensive comparison experiments and ablation analyses demonstrate the effectiveness of our approach in achieving diverse eyeglasses styles while preserving irrelevant areas.
11: \end{abstract}
12: