Results - Text in Videos - Robust Reading Competition

method: GoMatching (2024-01-19)

Authors: He Haibin, Ye Maoyuan, Zhang Jing, Liu Juhua, Tao Dacheng

Affiliation: Wuhan University-AI.INST

Description: We extend the off-the-shelf image text spotter DeepSolo into a video text spotter via a long-short term matching module.

GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching

Source code

method: TransDETR (2022-04-15)

Authors: weijia

Affiliation: Zhejiang University & Kuaishou (MMU)

Email: weijiawu@zju.edu.cn

Description: A simple but effective end-to-end video text DEtection, Tracking, and Recognition framework (TransDETR). TransDETR has two main advantages: 1) Unlike the explicit matching paradigm between adjacent frames, TransDETR tracks and recognizes each text instance implicitly, using a distinct query (termed a text query) for each instance over a long-range temporal sequence (more than 7 frames). 2) TransDETR is the first end-to-end trainable video text spotting framework that simultaneously addresses all three sub-tasks (text detection, tracking, and recognition).

Wu, Weijia, Debing Zhang, Ying Fu, Chunhua Shen, Hong Zhou, Yuanqiang Cai, and Ping Luo. "End-to-End Video Text Spotting with Transformer." arXiv preprint arXiv:2203.10539 (2022).

Source code

Ranking Table


Date	Method	MOTA	MOTP	IDF1	Mostly Matched	Partially Matched	Mostly Lost
2024-01-19	GoMatching	72.04%	78.53%	80.11%	1002	205	160
2022-04-15	TransDETR	60.96%	74.61%	72.80%	644	323	400
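
The track counts in the table can be compared more directly as shares of the total. The sketch below is illustrative only and is not part of the competition's evaluation code: it assumes that Mostly Matched, Partially Matched, and Mostly Lost together partition the ground-truth text tracks, which the table supports since both rows sum to the same total. The names ROWS and track_shares are hypothetical.

```python
# Rows copied from the ranking table:
# (method, mostly matched, partially matched, mostly lost)
ROWS = [
    ("GoMatching", 1002, 205, 160),
    ("TransDETR", 644, 323, 400),
]


def track_shares(mostly_matched, partially_matched, mostly_lost):
    """Return the total track count and each category's share of it.

    Assumption: the three categories partition the ground-truth tracks.
    """
    total = mostly_matched + partially_matched + mostly_lost
    shares = {
        "mostly_matched": mostly_matched / total,
        "partially_matched": partially_matched / total,
        "mostly_lost": mostly_lost / total,
    }
    return total, shares


for method, mm, pm, ml in ROWS:
    total, shares = track_shares(mm, pm, ml)
    print(
        f"{method}: {total} tracks, "
        f"{shares['mostly_matched']:.1%} mostly matched, "
        f"{shares['mostly_lost']:.1%} mostly lost"
    )
```

Both methods are scored against the same 1367 tracks under this assumption, so the shares are directly comparable across rows.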
