method: TextFuseNet (2020-07-31)

Authors: Jian Ye, Zhe Chen, Juhua Liu and Bo Du

Affiliation: Wuhan University, The University of Sydney

Email: liujuhua@whu.edu.cn

Description: Arbitrary shape text detection in natural scenes is an extremely challenging task. Unlike existing text detection approaches that only perceive texts based on limited feature representations, we propose a novel framework, namely TextFuseNet, that exploits richer fused features for text detection. More specifically, we propose to perceive texts from three levels of feature representation, i.e., character-, word- and global-level, and then introduce a novel text representation fusion technique to help achieve robust arbitrary-shape text detection. The multi-level feature representation can adequately describe texts by dissecting them into individual characters while still maintaining their general semantics. TextFuseNet then collects and merges the texts' features from different levels using a multi-path fusion architecture, which can effectively align and fuse the different representations. In practice, our proposed TextFuseNet can learn a more adequate description of texts of arbitrary shapes, suppressing false positives and producing more accurate detection results. Our proposed framework can also be trained with weak supervision for datasets that lack character-level annotations. Experiments on several datasets show that the proposed TextFuseNet achieves state-of-the-art performance. Specifically, we achieve an F-measure of 94.3% on ICDAR2013, 92.1% on ICDAR2015, 87.1% on Total-Text and 86.6% on CTW-1500, respectively.
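
The description above outlines a multi-path fusion of character-, word- and global-level features. The following is a minimal PyTorch-style sketch of that fusion idea only; it is not the authors' implementation, and the module names, channel sizes and the choice of bilinear alignment with 1x1/3x3 convolutions are illustrative assumptions.

    # Illustrative sketch (not TextFuseNet's official code): project the three
    # feature levels into a shared space, align them to a common spatial size,
    # and fuse them with an element-wise sum followed by a 3x3 convolution.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiPathFusion(nn.Module):
        def __init__(self, channels=256):
            super().__init__()
            # 1x1 convolutions project each representation level into a shared space
            self.char_proj = nn.Conv2d(channels, channels, kernel_size=1)
            self.word_proj = nn.Conv2d(channels, channels, kernel_size=1)
            self.global_proj = nn.Conv2d(channels, channels, kernel_size=1)
            # 3x3 convolution fuses the summed features into the final representation
            self.fuse = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

        def forward(self, char_feat, word_feat, global_feat):
            # Align character- and global-level maps to the word-level spatial size
            size = word_feat.shape[-2:]
            char_feat = F.interpolate(self.char_proj(char_feat), size=size,
                                      mode="bilinear", align_corners=False)
            global_feat = F.interpolate(self.global_proj(global_feat), size=size,
                                        mode="bilinear", align_corners=False)
            fused = char_feat + self.word_proj(word_feat) + global_feat
            return F.relu(self.fuse(fused))

    # Dummy usage: three feature maps at different resolutions fuse to the word-level size.
    m = MultiPathFusion()
    c = torch.randn(1, 256, 14, 14)   # character-level features
    w = torch.randn(1, 256, 28, 28)   # word-level features
    g = torch.randn(1, 256, 7, 7)     # global-level features
    print(m(c, w, g).shape)           # torch.Size([1, 256, 28, 28])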

method: VARCO (2020-12-15)

Authors: Jaemyung Lee, Jusung Lee, Younghyun Lee, Joonsoo Lee

Affiliation: NCSOFT

Description: This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 1711117050, Text Localization and Recognition for Efficient Digital Contents Analysis).

method: HIT (2020-05-13)

Authors: Sihwan Kim and Taejang Park

Affiliation: Hana Institute of Technology

Description: we present the network architecture to maximize conditional log-likelihood by optimizing the lower bound with a proper approximate posterior that has shown impressive performance in several generative model. In addition, by extending layer of latent variables to multiple layers, the network is able to learn scale robust features with no task specific regularization or data augmentation. We provide a detailed analysis and show the results of three public benchmarks to confirm the efficiency and reliability of the proposed algorithm.

Ranking Table

Date | Method | Recall | Precision | Hmean
2020-07-31 | TextFuseNet | 90.78% | 95.58% | 93.11%
2020-12-15 | VARCO | 89.86% | 93.63% | 91.71%
2020-05-13 | HIT | 89.22% | 93.85% | 91.48%
2018-11-07 | CRAFT | 89.04% | 93.93% | 91.42%
2020-01-20 | VARCO | 90.50% | 92.01% | 91.25%
2018-01-22 | FOTS | 89.68% | 91.43% | 90.55%
2018-12-03 | SPCNet_TongJi & UESTC (single scale) | 88.68% | 91.86% | 90.24%
2019-07-12 | stela | 88.13% | 91.38% | 89.73%
2017-08-10 | SRC-B-MachineLearningLab-v4 | 87.49% | 90.81% | 89.12%
2017-12-15 | EPTN-SJTU | 87.31% | 90.62% | 88.93%
2020-05-19 | Craft++ | 86.67% | 91.07% | 88.82%
2020-11-10 | Hancom Vision | 81.74% | 92.94% | 86.98%
2017-03-22 | MCLAB_TextBoxes_v2 | 83.29% | 89.94% | 86.49%
2016-12-16 | RRPN-4 | 83.56% | 89.53% | 86.44%
2018-01-04 | crpn | 82.28% | 89.65% | 85.81%
2018-12-08 | Unicamp-SRBR-v2 | 80.82% | 90.49% | 85.38%
2016-08-31 | MCLAB_TextBoxes | 82.28% | 87.82% | 84.96%
2015-03-26 | VGGMaxNet_cmb | 78.08% | 90.09% | 83.66%
2015-04-02 | VGGMaxNet_025 | 79.82% | 87.58% | 83.52%
2018-12-08 | Unicamp-SRBR-v3 | 75.62% | 92.62% | 83.26%
2015-03-23 | VGGMaxNet_013 | 76.53% | 90.50% | 82.93%
2015-03-23 | VGGMaxNet_1.6 | 75.89% | 91.32% | 82.89%
2019-06-26 | std(single-scale) | 76.99% | 80.98% | 78.93%
2016-11-13 | RRPN-3 | 70.50% | 88.23% | 78.38%
2015-01-01 | BUCT_YST | 72.15% | 83.60% | 77.45%
2016-03-16 | TextConv+WordGraph | 67.67% | 89.49% | 77.07%
2014-11-12 | HUST_MCLAB | 68.49% | 83.33% | 75.19%
2018-12-08 | Unicamp-SRBR-v1 | 63.20% | 88.04% | 73.58%
2015-04-03 | StradVision | 66.03% | 80.87% | 72.70%
2013-04-09 | I2R_NUS_FAR | 68.95% | 74.46% | 71.60%
2013-04-07 | USTB_TexStar | 61.46% | 84.76% | 71.25%
2017-03-16 | Ali-Amap-xlab-v2 | 66.03% | 76.35% | 70.81%
2016-06-23 | SRC-B-TextProcessingLab | 64.02% | 79.03% | 70.74%
2017-10-12 | TextFCN V2 | 74.52% | 66.61% | 70.34%
2013-04-08 | CASIA_NLPR | 66.12% | 74.64% | 70.12%
2016-12-04 | Ali-Amap-xlab | 64.93% | 75.96% | 70.01%
2013-04-05 | TextSpotter | 61.19% | 81.61% | 69.94%
2015-03-23 | VGGMaxNet_10 | 54.70% | 96.61% | 69.85%
2015-07-22 | ZText | 60.00% | 82.95% | 69.63%
2015-11-04 | MSER_Binary_CNN | 63.29% | 76.49% | 69.27%
2013-04-08 | I2R_NUS | 65.30% | 72.08% | 68.52%
2018-12-29 | fast_ret_sh_02 | 58.81% | 76.76% | 66.60%
2014-08-18 | DetectText | 59.82% | 71.58% | 65.17%
2013-04-08 | Text_detector_CASIA | 54.70% | 80.19% | 65.04%
2013-08-29 | UMD_IntegratedDisrimination | 52.69% | 81.61% | 64.04%
2013-04-08 | TH-TextLoc | 50.78% | 59.66% | 54.86%
2015-08-18 | MSER with LocalSWT | 38.90% | 60.43% | 47.33%
2013-04-06 | Text Detection | 34.25% | 60.29% | 43.68%
2014-06-10 | IWRR2014 | 32.24% | 56.12% | 40.95%
2017-03-05 | WeText | 31.23% | 56.16% | 40.14%
2016-11-08 | CTPN | 28.40% | 54.85% | 37.42%
2013-04-10 | Inkam | 28.04% | 29.10% | 28.56%
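
In the table above, Hmean appears to be the F-measure, i.e. the harmonic mean of precision and recall, which is consistent with the listed values:

    \mathrm{Hmean} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \qquad \text{e.g. for TextFuseNet: } \frac{2 \times 0.9558 \times 0.9078}{0.9558 + 0.9078} \approx 0.9311 \;(93.11\%)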

Ranking Graphic