Results - Incidental Scene Text - Robust Reading Competition

method: TextFuseNet (2020-07-31)

Authors: Jian Ye, Zhe Chen, Juhua Liu and Bo Du

Affiliation: Wuhan University, The University of Sydney

Description: Arbitrary shape text detection in natural scenes is an extremely challenging task. Unlike existing text detection approaches that only perceive texts based on limited feature representations, we propose a novel framework, namely TextFuseNet, to exploit richer fused features for text detection. More specifically, we propose to perceive texts from three levels of feature representations, i.e., character-, word- and global-level, and then introduce a novel text representation fusion technique to help achieve robust arbitrary text detection. The multi-level feature representation can adequately describe texts by dissecting them into individual characters while still maintaining their general semantics. TextFuseNet then collects and merges the texts’ features from different levels using a multi-path fusion architecture which can effectively align and fuse different representations. In practice, our proposed TextFuseNet can learn a more adequate description of arbitrary-shape texts, suppressing false positives and producing more accurate detection results. Our proposed framework can also be trained with weak supervision for those datasets that lack character-level annotations. Experiments on several datasets show that the proposed TextFuseNet achieves state-of-the-art performance. Specifically, we achieve an F-measure of 94.3% on ICDAR2013, 92.1% on ICDAR2015, 87.1% on Total-Text and 86.6% on CTW-1500.

@inproceedings{ijcai2020-72,
  title={TextFuseNet: Scene Text Detection with Richer Fused Features},
  author={Ye, Jian and Chen, Zhe and Liu, Juhua and Du, Bo},
  booktitle={Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, {IJCAI-20}},
  publisher={International Joint Conferences on Artificial Intelligence Organization},
  pages={516--522},
  year={2020}
}

method: TH (2020-01-22)

Authors: Tsinghua University and Hyundai Motor Group AIRS Company

Email: Shanyu Xiao: xiaosy19@mails.tsinghua.edu.cn

Description: We have built an end-to-end scene text spotter based on Mask R-CNN & Transformer. The ResNet-50 backbone and multiscale training/testing are used.

method: JDAI (2019-08-13)

Authors: Jingyang Lin, Jiajia Geng, Rongfeng Lai

Description: We are from JDAI and Sun Yat-Sen University. Our method is a strong scene text detection baseline built on the Mask R-CNN architecture.

Ranking Table

Date	Method	Recall	Precision	Hmean
2020-07-31	TextFuseNet	90.56%	93.96%	92.23%
2020-01-22	TH	89.46%	94.03%	91.69%
2019-08-13	JDAI	90.85%	92.50%	91.67%
2018-07-03	Baidu VIS v2	88.11%	94.04%	90.98%
2018-01-31	Alibaba-PAI	87.34%	93.84%	90.47%
2018-01-22	FOTS	87.92%	91.85%	89.84%
2018-11-15	Pixel-Anchor(Multiscale)	86.95%	92.28%	89.54%
2018-03-05	HoText_v1	83.58%	96.34%	89.51%
2017-09-15	Baidu VIS	83.39%	93.62%	88.21%
2018-06-26	SPCNet_TongJi & UESTC (single scale)	86.71%	88.94%	87.81%
2019-05-17	SEG-PIXEL-PAN (single-scale)	85.32%	90.22%	87.70%
2018-11-15	Pixel-Anchor(single scale)	87.05%	88.32%	87.68%
2018-05-18	PSENet_NJU_ImagineLab (single-scale)	85.22%	89.30%	87.21%
2018-12-03	CV_OCR_NOOB(single-scale)	82.96%	91.55%	87.04%
2019-04-08	CRAFT	84.26%	89.79%	86.93%
2017-07-12	Tencent-DPPR	81.80%	90.71%	86.03%
2017-07-04	Baidu IDL v3	81.99%	89.82%	85.73%
2017-05-13	SRC-B-MachineLearningLab-v3	82.81%	88.66%	85.64%
2020-04-18	MMLab-PolarMask++(Single Scale)	83.53%	87.36%	85.40%
2017-09-13	PixelLink	83.77%	86.65%	85.19%
2018-12-02	EPTN-SJTU	80.93%	89.13%	84.83%
2018-01-04	crpn	80.69%	88.77%	84.54%
2017-10-22	FTDN-SJTU-v2	80.93%	87.69%	84.18%
2017-09-03	CCFLAB_FTSN	80.07%	88.65%	84.14%
2017-02-17	NLPR-CASIA	82.76%	84.76%	83.75%
2018-08-09	YY-tl_final	82.43%	84.63%	83.51%
2017-09-04	FTDN-SJTU	80.55%	86.59%	83.46%
2019-07-15	stela	78.57%	88.70%	83.33%
2019-04-10	EAST-VGG16	81.27%	84.36%	82.79%
2018-01-10	HoText_v0	79.20%	86.40%	82.64%
2020-08-14	DAL(multi-scale)	80.45%	84.35%	82.36%
2020-08-13	DAL	79.49%	83.68%	81.53%
2017-07-31	EAST reimplemention with resnet 50	77.32%	84.66%	80.83%
2017-01-23	RRPN-4	77.13%	83.52%	80.20%
2016-10-28	RRPN-3	73.23%	82.17%	77.44%
2017-01-19	SRC-B-MachineLearningLab	69.86%	86.11%	77.14%
2017-02-12	SSTD	73.86%	80.23%	76.91%
2017-07-14	zju_cvte_seglink_512	72.85%	80.22%	76.36%
2022-01-24	another_segText	74.00%	76.62%	75.29%
2016-10-25	Baidu IDL v2	72.75%	77.41%	75.01%
2015-11-11	Megvii-Image++	56.96%	72.40%	63.76%
2019-07-23	std++(single-scale)	56.67%	71.64%	63.28%
2016-11-08	CTPN	51.56%	74.22%	60.85%
2018-12-27	fast_ret_sh_02	54.07%	65.87%	59.39%
2017-09-21	UCAS_CMVT3	49.16%	66.38%	56.49%
2015-04-03	Stradvision-2	36.74%	77.46%	49.84%
2015-04-02	StradVision-1	46.27%	53.39%	49.57%
2015-04-02	NJU_Text_Version4	35.82%	72.73%	48.00%
2015-04-01	NJU Text (Version2)	36.25%	70.44%	47.87%
2015-03-31	AJOU	46.94%	47.26%	47.10%
2015-03-30	NJU_Text_Version1	38.32%	56.33%	45.62%
2015-03-31	NJU_Text_Version2	37.46%	54.14%	44.28%
2017-10-12	TextFCN V2	37.02%	54.19%	43.99%
2015-04-02	NJU_Text_Version5	37.84%	51.41%	43.59%
2015-04-02	HUST MCLAB (VER3.0)	37.79%	44.00%	40.66%
2015-04-02	HUST_MCLAB_VER1.0	34.81%	47.47%	40.17%
2015-04-02	HUST_MCLAB_VER2.0	34.09%	46.49%	39.33%
2015-04-02	HUST_MCLAB_VER.0	34.09%	46.49%	39.33%
2015-04-02	Deep2Text-MO	32.11%	49.59%	38.98%
2015-04-03	CNN Proposal Based MSER	34.42%	34.71%	34.57%
2015-04-03	TD-IMU	25.28%	34.56%	29.20%
2015-04-03	TextCatcher-2 (LRDE)	34.81%	24.91%	29.04%
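
The Hmean column is the harmonic mean of Recall and Precision. As a minimal sketch, assuming both inputs are given as percentages (the function name `hmean` is illustrative and not part of the competition tooling), it can be recomputed from the other two columns. Small last-digit differences from the table are possible, because the published figures are presumably derived from unrounded values.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both in percent)."""
    total = recall + precision
    if total == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * recall * precision / total


# Recompute the Hmean for the first two rows of the table above.
rows = [
    ("TextFuseNet", 90.56, 93.96),
    ("TH", 89.46, 94.03),
]
for name, recall, precision in rows:
    print(f"{name}\t{hmean(recall, precision):.2f}%")
```

For the two rows shown, the recomputed values round to 92.23% and 91.69%, which matches the table.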
