4a1955fb186c84df.tex
1: \begin{abstract}
2: Scene text recognition is an important and challenging task in computer vision. However, most prior work focuses on recognizing words from a pre-defined vocabulary, whereas real-world applications contain many out-of-vocabulary (OOV) words.
3: %Recognizing out-of-vocabulary (OOV) words remains a challenge, and some studies suggest distinguishing between in-vocabulary (IV) and OOV words. 
4: In this paper, we propose a novel open-vocabulary text recognition framework, Pseudo-OCR, to recognize OOV words. The key challenge in this task is the lack of OOV training data. To address this problem, we first propose a pseudo label generation module that leverages character detection and image inpainting to produce a large amount of pseudo OOV training data from real-world images. Unlike previous synthetic data, our pseudo OOV data contains real characters and backgrounds, and therefore more closely simulates real-world applications.
5: Secondly, to reduce noise in the pseudo data, we present a semantic checking mechanism that selects semantically meaningful data.
6: Thirdly, we introduce a quality-aware margin loss to improve training with pseudo data. Our loss includes a margin-based part that enhances classification ability and a quality-aware part that penalizes low-quality samples in both real and pseudo data.
7: %loss to increase inter-class distances and reduce intra-class distances, moreover the quality detector could decrease the low-quality image influence for training converge.introduce an approach that optimizes the geodesic distance margins to reduce the impact of noisy samples in training data on model convergence during training. A novel text quality adaptive mechanism has been introduced to dynamically adjust the margin of each class.
8: Extensive experiments demonstrate that our approach outperforms state-of-the-art methods on eight datasets and ranks first in the ICDAR2022 challenge.
9: %The code and models will be publicly available at \url{https://github.com/xuhuaren/Pseudo-OCR}.
10: \end{abstract}
11: