
\begin{abstract}


Virtual try-on of eyeglasses involves placing eyeglasses of different shapes and styles onto a face image without physically trying them on.
While existing methods have shown impressive results, the variety of eyeglasses styles they support is limited, and their interactions are not always intuitive or efficient.
To address these limitations, we propose a Text-guided Eyeglasses Manipulation method that controls the shape and style of the eyeglasses through a binary mask and text, respectively.
Specifically, we introduce a mask encoder to extract mask conditions and a modulation module that injects text and mask conditions simultaneously.
This design allows fine-grained control of the eyeglasses' appearance based on both textual descriptions and spatial constraints.
Our approach also includes a disentangled mapper and a decoupling strategy that preserves irrelevant areas, resulting in better local editing.
We employ a two-stage training scheme to handle the different convergence speeds of conditions from different modalities, which enables control over both the shape and the style of the eyeglasses.
Extensive comparative experiments and ablation analyses demonstrate the effectiveness of our approach in achieving diverse eyeglasses styles while preserving irrelevant areas.

\end{abstract}
