Results - ICDAR2017 Competition on Multi-lingual scene text detection and script identification

method: AntAI-Cognition (2020-04-22)

Authors: Qingpei Guo, Yudong Liu, Pengcheng Yang, Yonggang Li, Yongtao Wang, Jingdong Chen, Wei Chu

Affiliation: Ant Group & PKU

Description: Our approach is an ensemble of three text detection models. Each model follows the Mask R-CNN framework [1] with a different backbone: ResNeXt101-64x4d [2], CBNet [3], or ResNeXt101-32x32d_wsl [4]. A GBDT [5] is trained to normalize confidence scores and to select the highest-quality quadrilateral boxes from the outputs of all text detection models. Multi-scale training and testing are adopted for all base models. We also add the ICDAR19 MLT dataset to the training data; both training and validation sets are used to obtain the final result.

[1] He K, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2961-2969.
[2] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 1492-1500.
[3] Liu Y, Wang Y, Wang S, et al. CBNet: A novel composite backbone network architecture for object detection[J]. arXiv preprint arXiv:1909.03625, 2019.
[4] Mahajan D, Girshick R, Ramanathan V, et al. Exploring the limits of weakly supervised pretraining[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 181-196.
[5] Ke G, Meng Q, Finley T, et al. LightGBM: A highly efficient gradient boosting decision tree[C]//Advances in Neural Information Processing Systems. 2017: 3146-3154.
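
The ensemble step in the AntAI-Cognition description (pool the boxes from several detectors, put their confidence scores on a common scale, keep the best box among overlapping candidates) can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several simplifying assumptions: boxes are axis-aligned `(x1, y1, x2, y2)` tuples rather than quadrilaterals, the trained GBDT is replaced by a caller-supplied `rescore` function, and selection is done with plain greedy non-maximum suppression. All names (`iou`, `ensemble`, `rescore`, `iou_thr`) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Axis-aligned (x1, y1, x2, y2); a simplified stand-in for a quadrilateral.
Box = Tuple[float, float, float, float]
Detection = Tuple[Box, float]  # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ensemble(
    outputs: Dict[str, List[Detection]],
    rescore: Callable[[str, Box, float], float],
    iou_thr: float = 0.5,
) -> List[Detection]:
    """Pool detections from all models, rescore them, then apply greedy NMS."""
    # `rescore` is a hypothetical placeholder for a learned score normalizer.
    pooled = [
        (box, rescore(model, box, score))
        for model, detections in outputs.items()
        for box, score in detections
    ]
    pooled.sort(key=lambda d: d[1], reverse=True)

    kept: List[Detection] = []
    for box, score in pooled:
        # Keep a box only if it does not heavily overlap a higher-scored one.
        if all(iou(box, kept_box) < iou_thr for kept_box, _ in kept):
            kept.append((box, score))
    return kept


if __name__ == "__main__":
    outputs = {
        "model_a": [((0, 0, 10, 10), 0.9)],
        "model_b": [((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)],
    }
    # Toy normalizer: assume model_b systematically under-reports confidence.
    toy_rescore = lambda model, box, s: s * (1.2 if model == "model_b" else 1.0)
    for box, score in ensemble(outputs, toy_rescore):
        print(box, round(score, 2))
```

In a real system the rescoring function would be a trained model that takes box and detector features as input, and the overlap test would use polygon intersection instead of axis-aligned rectangles.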

method: TH (2020-04-16)

Authors: Tsinghua University and Hyundai Motor Group AIRS Company

Email: Shanyu Xiao: xiaosy19@mails.tsinghua.edu.cn

Description: We have built an end-to-end scene text spotter based on Mask R-CNN & Transformer. The ResNeXt-101 backbone and multiscale training/testing are used.

method: Sogou_OCR (2019-11-08)

Authors: Xudong Rao, Lulu Xu, Long Ma, Xuefeng Su

Description: An arbitrary-shaped text detection method based on Mask R-CNN. We use ResNeXt-152 as the backbone, and multi-scale training and testing are adopted to obtain the final results.

Ranking Table


Date	Method	Hmean	Precision	Recall	Average Precision
2020-04-22	AntAI-Cognition	16.47%	12.16%	25.54%	3.86%
2020-04-16	TH	14.72%	11.38%	20.85%	3.49%
2019-11-08	Sogou_OCR	13.53%	10.28%	19.77%	1.96%
2019-03-23	PMTD	13.52%	9.69%	22.34%	2.90%
2019-06-02	NJU-ImagineLab	12.06%	8.35%	21.71%	1.96%
2019-05-30	PMTD	11.27%	8.07%	18.68%	1.86%
2019-08-08	JDAI	9.95%	7.12%	16.53%	1.66%
2019-05-08	Baidu-VIS	9.40%	6.86%	14.96%	1.01%
2019-12-13	BDN	9.20%	6.02%	19.48%	1.19%
2018-11-20	Pixel-Anchor	9.15%	6.96%	13.36%	0.82%
2019-06-11	4Paradigm-Data-Intelligence	9.14%	6.21%	17.31%	1.10%
2019-05-23	4Paradigm-Data-Intelligence	8.64%	5.88%	16.22%	0.95%
2019-07-15	stela	7.99%	5.66%	13.59%	0.90%
2018-11-15	USTC-NELSLIP	7.11%	4.71%	14.47%	0.89%
2018-10-29	Amap-CVLab	7.09%	4.99%	12.24%	1.84%
2019-03-29	GNNets (single scale)	6.93%	5.37%	9.75%	0.40%
2018-11-28	CRAFT	6.39%	4.71%	9.93%	0.47%
2018-12-04	SPCNet_TongJi & UESTC (multi scale)	5.90%	3.94%	11.73%	0.46%
2018-05-18	PSENet_NJU_ImagineLab (single-scale)	5.11%	3.69%	8.32%	0.31%
2017-11-09	EAST++	4.40%	3.06%	7.87%	0.15%
2018-12-03	SPCNet_TongJi & UESTC (single scale)	4.18%	2.53%	12.13%	0.31%
2018-03-12	ATL Cangjie OCR	4.05%	2.63%	8.75%	0.38%
2018-12-05	EPTN-SJTU	3.71%	2.41%	7.98%	0.11%
2017-06-28	SCUT_DLVClab1	3.67%	2.67%	5.84%	0.16%
2019-01-08	ALGCD_CP	3.64%	2.38%	7.75%	0.12%
2019-05-30	Thesis-SE	3.46%	2.26%	7.47%	0.11%
2017-06-30	TH-DL	1.84%	1.25%	3.49%	0.08%
2017-06-30	Sensetime OCR	1.62%	0.88%	10.35%	0.20%
2017-06-28	SARI_FDU_RRPN_v0	1.16%	0.72%	3.06%	0.02%
2017-06-29	SARI_FDU_RRPN_v1	0.71%	0.46%	1.57%	0.01%
2017-06-30	linkage-ER-Flow	0.13%	0.07%	0.46%	0.00%
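
The Hmean column is the harmonic mean of the Precision and Recall columns. As a quick check, the short sketch below recomputes it for the top three rows of the table. The function name `hmean` is an illustrative choice, and because the published precision and recall are already rounded to two decimals, the recomputed value can differ from the table in the last digit.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same units, e.g. percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, precision %, recall %, Hmean % as listed in the table)
rows = [
    ("AntAI-Cognition", 12.16, 25.54, 16.47),
    ("TH", 11.38, 20.85, 14.72),
    ("Sogou_OCR", 10.28, 19.77, 13.53),
]

if __name__ == "__main__":
    for method, p, r, listed in rows:
        print(f"{method}: recomputed {hmean(p, r):.2f}%, listed {listed:.2f}%")
```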
