Results - Incidental Scene Text - Robust Reading Competition

method: TextFuseNet (2020-07-31)

Authors: Jian Ye, Zhe Chen, Juhua Liu and Bo Du

Affiliation: Wuhan University, The University of Sydney

Description: Arbitrary-shape text detection in natural scenes is an extremely challenging task. Unlike existing text detection approaches that only perceive texts based on limited feature representations, we propose a novel framework, namely TextFuseNet, to exploit richer fused features for text detection. More specifically, we propose to perceive texts from three levels of feature representations, i.e., character-, word- and global-level, and then introduce a novel text representation fusion technique to help achieve robust arbitrary-shape text detection. The multi-level feature representation can adequately describe texts by dissecting them into individual characters while still maintaining their general semantics. TextFuseNet then collects and merges the texts’ features from different levels using a multi-path fusion architecture which can effectively align and fuse different representations. In practice, our proposed TextFuseNet can learn a more adequate description of arbitrary-shape texts, suppressing false positives and producing more accurate detection results. Our proposed framework can also be trained with weak supervision for those datasets that lack character-level annotations. Experiments on several datasets show that the proposed TextFuseNet achieves state-of-the-art performance. Specifically, we achieve an F-measure of 94.3% on ICDAR2013, 92.1% on ICDAR2015, 87.1% on Total-Text and 86.6% on CTW-1500.

@inproceedings{ijcai2020-72,
  title={TextFuseNet: Scene Text Detection with Richer Fused Features},
  author={Ye, Jian and Chen, Zhe and Liu, Juhua and Du, Bo},
  booktitle={Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, {IJCAI-20}},
  publisher={International Joint Conferences on Artificial Intelligence Organization},
  pages={516--522},
  year={2020}
}


method: FOTS (2018-01-22)

Authors: Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan

Description: A unified end-to-end trainable Fast Oriented Text Spotting (FOTS) network for simultaneous detection and recognition, sharing computation and visual information among the two complementary tasks.

FOTS: Fast Oriented Text Spotting with a Unified Network, accepted by CVPR 2018

method: CRAFT (2019-04-08)

Authors: Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, and Hwalsuk Lee

Description: We propose a novel text detector called CRAFT. The proposed method effectively detects text area by exploring each character and affinity between characters. To overcome the lack of individual character level annotations, our framework exploits the pseudo character-level bounding boxes acquired by the learned interim model in a weakly-supervised manner.

Affiliation: Clova AI OCR Team, NAVER/LINE Corp.

Character Region Awareness for Text Detection, accepted by CVPR 2019

Ranking Table


Date	Method	Recall	Precision	Hmean
2020-07-31	TextFuseNet	90.56%	93.96%	92.23%
2018-01-22	FOTS	87.92%	91.85%	89.84%
2019-04-08	CRAFT	84.26%	89.79%	86.93%
2017-09-13	PixelLink	83.77%	86.65%	85.19%
2018-01-04	crpn	80.69%	88.77%	84.54%
2019-07-15	stela	78.57%	88.70%	83.33%
2019-04-10	EAST-VGG16	81.27%	84.36%	82.79%
2020-08-14	DAL(multi-scale)	80.45%	84.35%	82.36%
2020-08-13	DAL	79.49%	83.68%	81.53%
2017-01-23	RRPN-4	77.13%	83.52%	80.20%
2016-10-28	RRPN-3	73.23%	82.17%	77.44%
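
The Hmean column is consistent with the harmonic mean of Recall and Precision. The sketch below recomputes it for a few rows, assuming that definition; the function name `hmean` is illustrative and not part of any competition tooling. The official evaluation presumably works from raw match counts, so values recomputed from the rounded percentages may differ from the table in the last digit.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both given in percent)."""
    if recall + precision == 0:
        return 0.0  # avoid division by zero when nothing was detected or matched
    return 2 * recall * precision / (recall + precision)


# (method, recall %, precision %) copied from the ranking table
rows = [
    ("TextFuseNet", 90.56, 93.96),
    ("FOTS", 87.92, 91.85),
    ("PixelLink", 83.77, 86.65),
]

for name, recall, precision in rows:
    print(f"{name}: {hmean(recall, precision):.2f}%")
```

Because the harmonic mean is pulled toward the lower of the two values, a method cannot reach a high Hmean by trading away recall for precision or the reverse.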
