method: SANHL_v1 (2019-04-30)

Authors: In Description

Description: We first detect candidate text lines, then predict the text strings with an ensemble recognition model. The result is submitted by researchers from South China University of Technology, Northwestern Polytechnical University, The University of Adelaide, Lenovo, and Huawei: Canjie Luo*, Yuliang Liu* (* equal contribution), Qingxiang Lin, Hao Chen, Tianwei Wang, Lele Xie, Lu Yang, Shuaitao Zhang, Linjiang Zhang, Tong He, Canyu Xie, Chongyu Liu, Xiaoxue Chen, Jiapeng Wang, Xiangle Chen, Dezhi Peng, Weihong Ma, Peng Wang, Hui Li, Lianwen Jin, Chunhua Shen, Yaqiang Wu, and Liangwei Wang.


method: CRAFT + TPS-ResNet v3 (2019-04-30)

Authors: Youngmin Baek, Chae Young Lee, Jeonghun Baek, Moonbin Yim, Sungrae Park, and Hwalsuk Lee

Description: [Detection part]
We propose a novel text detector called CRAFT. It effectively detects text regions by exploring each character and the affinity between characters. To overcome the lack of individual character-level annotations, our framework exploits pseudo character-level bounding boxes acquired by the learned interim model in a weakly-supervised manner.
[Recognition part]
We use a Thin-Plate Spline (TPS) based spatial transformer network (STN) to normalize the input text images, followed by a ResNet-based feature extractor, a BiLSTM sequence model, and an attention-based decoder.
This model was developed based on our analysis of scene text recognition modules.
See our paper and source code.
# CRAFT + TPS-ResNet v3 (tested with a small image size; all training data used for the recognition model)
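The grouping idea in the detection part — characters and the affinities between them form one connected region on a score map — can be sketched as below. This is a simplified stand-in, not the actual CRAFT post-processing: the `gaussian_heatmap` and `group_regions` helpers are illustrative names, and real CRAFT heatmaps are predicted by the network rather than rendered from known centers.

```python
import numpy as np
from collections import deque

def gaussian_heatmap(h, w, cx, cy, sigma):
    """Render a 2D Gaussian blob centered at (cx, cy) on an h x w map."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def group_regions(score, thresh=0.5):
    """Binarize a score map and return (x0, y0, x1, y1) bounding boxes
    of 4-connected components, found via BFS flood fill."""
    mask = score > thresh
    seen = np.zeros_like(mask)
    h, w = mask.shape
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not seen[sy, sx]:
                q = deque([(sy, sx)])
                seen[sy, sx] = True
                x0 = x1 = sx
                y0 = y1 = sy
                while q:
                    y, x = q.popleft()
                    x0, x1 = min(x0, x), max(x1, x)
                    y0, y1 = min(y0, y), max(y1, y)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                boxes.append((x0, y0, x1, y1))
    return boxes

# Two character blobs bridged by an affinity blob fuse into one word
# region; a distant character stays its own region.
score = np.maximum.reduce([
    gaussian_heatmap(40, 120, 20, 20, 8),   # character 1
    gaussian_heatmap(40, 120, 35, 20, 8),   # affinity between 1 and 2
    gaussian_heatmap(40, 120, 50, 20, 8),   # character 2
    gaussian_heatmap(40, 120, 100, 20, 8),  # isolated character
])
print(group_regions(score))  # two boxes: the fused word, then the lone character
```

Without the affinity blob, the two character blobs would not overlap at this threshold and would be grouped as separate regions — which is exactly why CRAFT predicts affinity in addition to character regions.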

Training Data
[Detection part]
We pre-trained our CRAFT model on SynthText, ICDAR 2013 FST, and ICDAR 2017 MLT, and fine-tuned it on some of the publicly released datasets of this year's ICDAR challenges: ArT, MLT, and ReCTS.
[Recognition part]
We first generated Chinese synthetic datasets with the MJSynth and SynthText code, then pre-trained our model on this synthetic data together with real datasets (ArT, LSVT, ReCTS, and RCTW). After that, we fine-tuned it on the ReCTS data.

Ranking Table

Date        Method                 Recall  Precision  Hmean   1-NED
2019-04-30  SANHL_v1               93.86%  91.98%     92.91%  81.43%
2021-09-20  ABCNetv2               87.91%  92.89%     90.33%  63.94%
2019-04-30  CRAFT + TPS-ResNet v3  75.89%  78.44%     77.14%  41.68%
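Hmean is the harmonic mean of recall and precision, and 1-NED is one minus the normalized edit distance between predicted and ground-truth strings, averaged over samples. A minimal sketch of both metrics (function names are illustrative; this is not the official evaluation script):

```python
def hmean(recall, precision):
    """Harmonic mean of recall and precision (the Hmean column)."""
    return 2 * recall * precision / (recall + precision)

def levenshtein(a, b):
    """Edit distance between strings a and b, by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def one_minus_ned(pred, gt):
    """1 - normalized edit distance for one prediction/target pair."""
    if not pred and not gt:
        return 1.0
    return 1.0 - levenshtein(pred, gt) / max(len(pred), len(gt))

# The SANHL_v1 row: recall 93.86% and precision 91.98% give Hmean 92.91%.
print(round(hmean(0.9386, 0.9198), 4))  # 0.9291
# One wrong character out of five leaves a per-sample 1-NED of 0.8.
print(one_minus_ned("ICDAR", "ICDAP"))  # 0.8
```

The same check reproduces the other rows, e.g. recall 75.89% and precision 78.44% give the CRAFT + TPS-ResNet v3 Hmean of 77.14%.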

Ranking Graphic