method: GoMatching (2024-01-19)

Authors: He Haibin, Ye Maoyuan, Zhang Jing, Liu Juhua, Tao Dacheng

Affiliation: Wuhan University-AI.INST

Description: We extend the off-the-shelf image text spotter DeepSolo to a video text spotter via a long-short term matching module.

method: LOGO (2024-05-30)

Authors: Hongen Liu, Di Sun, Jiahao Wang, Yi Liu, Gang Pan

Affiliation: College of Intelligence and Computing, Tianjin University; Tianjin University of Science and Technology; Baidu Inc.

Description: We propose a Language Collaboration and Glyph Perception Model, termed LOGO, to enhance the performance of conventional text spotters through the integration of a synergy module. To this end, a language synergy classifier (LSC) is designed to explicitly distinguish text instances from background noise in the recognition stage. In addition, glyph supervision and a visual position mixture module are proposed to improve recognition accuracy on noisy text regions and to acquire more discriminative tracking features, respectively.

method: h&h_lab (2021-08-02)

Authors: HUST_VLRGROUP(Dian Jin) & HUAWEI_CLOUD_EI(Jing Wang, Shenggao Zhu)

Affiliation: h&h_lab

Description: This method was designed specifically for the competition. We use the textual transcription of each word in the video to distinguish different text objects. Using the transcription improves video text tracking performance considerably, but it also requires supervision from the recognition results of the text instances. It is a crude but useful method. For task 4, we further apply post-processing to obtain a more accurate recognition result for each text object after its text trajectory has been generated.
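
The description does not say how transcriptions are compared or how ties are resolved, so the following is only a minimal sketch of the general idea: linking words across frames by the similarity of their recognized text. The function names `text_similarity` and `link_by_transcription`, the `min_sim` threshold, and the greedy assignment are all assumptions made for illustration, not the h&h_lab implementation. String similarity uses `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def text_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two transcriptions."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_by_transcription(tracks, detections, min_sim=0.7):
    """Greedily link each recognized word in a frame to an existing track.

    tracks: dict mapping track id -> last transcription seen for that track
            (updated in place).
    detections: list of transcriptions recognized in the current frame.
    min_sim: hypothetical threshold below which a new track is started.
    Returns one track id per detection.
    """
    assigned = []
    used = set()  # each track receives at most one detection per frame
    next_id = max(tracks, default=-1) + 1
    for text in detections:
        best_id, best_sim = None, min_sim
        for tid, last_text in tracks.items():
            if tid in used:
                continue
            sim = text_similarity(text, last_text)
            if sim >= best_sim:
                best_id, best_sim = tid, sim
        if best_id is None:
            # No sufficiently similar track: start a new one.
            best_id = next_id
            next_id += 1
        tracks[best_id] = text
        used.add(best_id)
        assigned.append(best_id)
    return assigned
```

In this sketch a slightly misrecognized word such as "Hote1" would still be linked to a track whose last transcription was "Hotel", which illustrates why tracking of this kind depends on the quality of the recognition results. A practical system would likely combine text similarity with spatial cues such as box overlap.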

Ranking Table

Date        Method                MOTA    MOTP    IDF1    Mostly Matched  Partially Matched  Mostly Lost
2024-01-19  GoMatching            72.04%  78.53%  80.11%  1002            205                160
2024-05-30  LOGO                  68.07%  73.00%  75.85%  795             294                278
2021-08-02  h&h_lab               63.76%  77.78%  71.08%  673             381                313
2022-04-15  TransDETR             60.96%  74.61%  72.80%  644             323                400
2020-02-26  HIK_OCR               52.98%  74.88%  61.85%  618             253                487
2016-04-13  Megvii-Image++        19.11%  67.28%  34.97%  134             397                836
2015-04-02  USTB_TexVideo         15.57%  68.47%  28.18%  122             382                778
2015-04-02  Deep2Text I (Video)   14.35%  68.75%  32.05%  200             296                786
2015-04-02  USTB_TexVideo II-2    13.24%  66.61%  21.25%  84              327                859
2015-04-17  Stradvision-1         8.98%   70.20%  31.95%  122             432                813
2015-04-02  USTB_TexVideo II-1    5.64%   58.76%  19.74%  111             263                872
2015-03-30  Baseline-TextSpotter  0.00%   0.00%   0.00%

Ranking Graphic