method: LOGO2024-05-30

Authors: Hongen Liu, Di Sun, Jiahao Wang, Yi Liu, Gang Pan

Affiliation: College of Intelligence and Computing, Tianjin University;Tianjin University of Science and Technology; Baidu Inc.

Description: We propose a Language Collaboration and Glyph Perception Model, termed LOGO to enhance the performance of conventional text spotters through the integration of a synergy module. To achieve this goal, a language synergy classifier (LSC) is designed to explicitly discern text instances from background noise in the recognition stage. Besides, the glyph supervision and visual position mixture module are proposed to enhance the recognition accuracy of noisy text regions, and acquire more discriminative tracking features, respectively.