method: TH (2020-04-16)

Authors: Tsinghua University and Hyundai Motor Group AIRS Company

Email: Shanyu Xiao: xiaosy19@mails.tsinghua.edu.cn

Description: We built an end-to-end scene text spotter based on Mask R-CNN and a Transformer. A ResNeXt-101 backbone and multi-scale training/testing are used.
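
Several entries in this ranking rely on multi-scale testing. As a rough illustration only (not the TH team's actual pipeline), the sketch below runs an assumed `detector` callable at a few image scales, maps the boxes back to the original resolution, and merges the pooled detections with NMS; the scale set and IoU threshold are made-up defaults, and axis-aligned boxes are used for simplicity.

```python
# Hedged sketch of generic multi-scale testing; `detector` is an assumed
# callable mapping an image to (boxes Nx4 in [x1,y1,x2,y2], scores N).
import numpy as np
import cv2

def nms(boxes, scores, iou_thr=0.5):
    """Greedy non-maximum suppression on axis-aligned boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        rest = boxes[order[1:]]
        xx1 = np.maximum(boxes[i, 0], rest[:, 0])
        yy1 = np.maximum(boxes[i, 1], rest[:, 1])
        xx2 = np.minimum(boxes[i, 2], rest[:, 2])
        yy2 = np.minimum(boxes[i, 3], rest[:, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (rest[:, 2] - rest[:, 0]) * (rest[:, 3] - rest[:, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = order[1:][iou < iou_thr]
    return keep

def multiscale_detect(image, detector, scales=(0.5, 1.0, 1.5), iou_thr=0.5):
    """Run the detector at several scales, rescale boxes to the original
    resolution, and merge the pooled detections with NMS."""
    all_boxes, all_scores = [], []
    for s in scales:
        resized = cv2.resize(image, None, fx=s, fy=s)
        boxes, scores = detector(resized)
        all_boxes.append(np.asarray(boxes, dtype=float) / s)
        all_scores.append(np.asarray(scores, dtype=float))
    boxes = np.concatenate(all_boxes)
    scores = np.concatenate(all_scores)
    keep = nms(boxes, scores, iou_thr)
    return boxes[keep], scores[keep]
```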

method: Sogou_OCR (2019-11-08)

Authors: Xudong Rao, Lulu Xu, Long Ma, Xuefeng Su

Description: An arbitrary-shaped text detection method based on Mask R-CNN. We use ResNeXt-152 as the backbone, and multi-scale training and testing are adopted to obtain the final results.
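
For arbitrary-shaped text, a Mask R-CNN style detector typically converts each predicted instance mask into a text polygon in post-processing. A minimal sketch of that step, assuming OpenCV 4 and a soft mask in [0, 1] (an illustration, not Sogou's implementation; the thresholds are arbitrary defaults):

```python
# Hedged sketch: instance mask -> simplified text polygon (assumes OpenCV >= 4).
import numpy as np
import cv2

def mask_to_polygon(mask, score_thr=0.5, epsilon_ratio=0.01):
    """Binarize a soft instance mask, take its largest external contour,
    and simplify it into a polygon given as an (N, 2) array of (x, y) points."""
    binary = (mask > score_thr).astype(np.uint8)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    contour = max(contours, key=cv2.contourArea)
    epsilon = epsilon_ratio * cv2.arcLength(contour, True)
    polygon = cv2.approxPolyDP(contour, epsilon, True)
    return polygon.reshape(-1, 2)
```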

method: AntAI-Cognition (2020-04-22)

Authors: Qingpei Guo, Yudong Liu, Pengcheng Yang, Yonggang Li, Yongtao Wang, Jingdong Chen, Wei Chu

Affiliation: Ant Group & PKU

Email: qingpei.gqp@antgroup.com

Description: Our approach is an ensemble of three text detection models. The detectors follow the Mask R-CNN framework [1], with different backbones (ResNeXt101-64x4d [2], CBNet [3], ResNeXt101-32x32d_wsl [4]). A GBDT [5] is trained to normalize confidence scores and select the highest-quality quadrilateral boxes from the outputs of all detection models. Multi-scale training and testing are adopted for all base models. We also add the ICDAR19 MLT dataset to the training data; both its training and validation sets are used to produce the final result.

[1] He K, Gkioxari G, Dollár P, et al. Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision, 2017: 2961-2969.
[2] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 1492-1500.
[3] Liu Y, Wang Y, Wang S, et al. CBNet: A novel composite backbone network architecture for object detection. arXiv preprint arXiv:1909.03625, 2019.
[4] Mahajan D, Girshick R, Ramanathan V, et al. Exploring the limits of weakly supervised pretraining. Proceedings of the European Conference on Computer Vision (ECCV), 2018: 181-196.
[5] Ke G, Meng Q, Finley T, et al. LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 2017: 3146-3154.
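
As a rough sketch of the GBDT re-scoring step described above (not the authors' code): candidate boxes pooled from the three detectors get simple hand-crafted features, a LightGBM classifier [5] is trained on matched/unmatched labels, and its calibrated probability replaces each detector's raw confidence before the final box selection. The feature choices, hyper-parameters, and the `box_features` helper are assumptions made for illustration.

```python
# Hedged sketch: LightGBM as the GBDT that normalizes confidences across
# detectors so their quadrilateral boxes become directly comparable.
import numpy as np
import lightgbm as lgb

def box_features(box, score, model_id, n_models=3):
    """Per-candidate features: raw score, geometry of the axis-aligned
    bounding rectangle of the quadrilateral, and a one-hot source-model id."""
    box = np.asarray(box, dtype=float)      # box = [x1, y1, ..., x4, y4]
    xs, ys = box[0::2], box[1::2]
    w, h = xs.max() - xs.min(), ys.max() - ys.min()
    one_hot = np.eye(n_models)[model_id]
    return np.concatenate([[score, w, h, w / max(h, 1e-6)], one_hot])

def fit_rescorer(features, labels):
    """Train the GBDT; labels mark candidates that match a ground-truth
    box (e.g. IoU > 0.5) as 1, all others as 0."""
    model = lgb.LGBMClassifier(n_estimators=200, learning_rate=0.05)
    model.fit(np.asarray(features), np.asarray(labels))
    return model

def rescore(model, features):
    """Calibrated probability of being a correct box, used in place of
    each detector's raw confidence before the final selection step."""
    return model.predict_proba(np.asarray(features))[:, 1]
```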

Ranking Table

Date        Method                                Hmean    Precision  Recall   Average Precision
2020-04-16  TH                                    44.92%   29.49%     94.22%   75.64%
2019-11-08  Sogou_OCR                             44.89%   29.13%     97.76%   85.12%
2020-04-22  AntAI-Cognition                       42.78%   27.46%     96.66%   84.29%
2018-11-20  Pixel-Anchor                          40.29%   26.10%     88.29%   51.88%
2019-03-29  GNNets (single scale)                 38.92%   25.45%     82.71%   34.04%
2019-08-08  JDAI                                  38.52%   24.15%     95.21%   77.19%
2019-05-30  PMTD                                  38.51%   24.22%     93.95%   82.23%
2019-05-08  Baidu-VIS                             38.13%   24.12%     91.00%   22.86%
2019-03-23  PMTD                                  37.55%   23.71%     90.18%   49.86%
2017-06-28  SCUT_DLVClab1                         36.60%   23.06%     88.68%   72.16%
2019-06-02  NJU-ImagineLab                        36.43%   22.49%     95.80%   82.09%
2018-10-29  Amap-CVLab                            35.12%   21.79%     90.53%   69.38%
2018-11-28  CRAFT                                 35.05%   22.27%     82.32%   19.53%
2019-06-11  4Paradigm-Data-Intelligence           33.95%   20.71%     94.15%   20.21%
2019-05-23  4Paradigm-Data-Intelligence           33.46%   20.43%     92.30%   19.04%
2018-05-18  PSENet_NJU_ImagineLab (single-scale)  33.21%   20.94%     80.16%   17.24%
2019-07-15  stela                                 32.40%   20.21%     81.69%   60.02%
2018-11-15  USTC-NELSLIP                          31.22%   18.74%     93.60%   81.67%
2018-12-04  SPCNet_TongJi & UESTC (multi scale)   30.98%   18.66%     91.16%   17.08%
2019-12-13  BDN                                   30.57%   18.26%     93.71%   18.50%
2017-11-09  EAST++                                28.99%   17.83%     77.49%   22.17%
2017-06-30  TH-DL                                 28.58%   17.37%     80.63%   52.72%
2018-03-12  ATL Cangjie OCR                       27.93%   16.56%     89.12%   60.12%
2019-01-08  ALGCD_CP                              27.75%   16.50%     87.23%   17.27%
2017-06-29  SARI_FDU_RRPN_v1                      26.38%   15.53%     87.39%   61.20%
2018-12-05  EPTN-SJTU                             25.29%   14.98%     81.02%   20.12%
2019-05-30  Thesis-SE                             24.04%   14.24%     77.13%   14.34%
2018-12-03  SPCNet_TongJi & UESTC (single scale)  22.24%   12.62%     93.56%   11.97%
2017-06-28  SARI_FDU_RRPN_v0                      21.52%   12.36%     83.34%   43.90%
2017-06-30  Sensetime OCR                         10.32%    5.46%     93.44%   60.68%
2017-06-30  linkage-ER-Flow                        3.20%    1.78%     15.68%    0.38%
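
For reference, Hmean in the table is the harmonic mean of precision and recall; the snippet below reproduces the top row (TH) from its listed precision and recall.

```python
# Hmean = 2 * P * R / (P + R), checked against the TH row of the table.
p, r = 0.2949, 0.9422
hmean = 2 * p * r / (p + r)
print(f"{hmean:.2%}")  # -> 44.92%, matching the listed Hmean
```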

Ranking Graphic

(ranking plot omitted)