Results - ICDAR2017 Competition on Multi-lingual scene text detection and script identification

method: AntAI-Cognition (2020-04-22)

Authors: Qingpei Guo, Yudong Liu, Pengcheng Yang, Yonggang Li, Yongtao Wang, Jingdong Chen, Wei Chu

Affiliation: Ant Group & PKU

Description: Our approach is an ensemble of three text detection models. Each model mainly follows the Mask R-CNN framework [1], with a different backbone: ResNeXt101-64x4d [2], CBNet [3], or ResNeXt101-32x32d_wsl [4]. A GBDT [5] is trained to normalize confidence scores and to select the highest-quality quadrilateral boxes from the outputs of all the text detection models. Multi-scale training and testing are adopted for all base models. The training data also includes the ICDAR19 MLT datasets; both training and validation sets are used to obtain the final result.

[1] He K, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2961-2969.
[2] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 1492-1500.
[3] Liu Y, Wang Y, Wang S, et al. CBNet: A novel composite backbone network architecture for object detection[J]. arXiv preprint arXiv:1909.03625, 2019.
[4] Mahajan D, Girshick R, Ramanathan V, et al. Exploring the limits of weakly supervised pretraining[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 181-196.
[5] Ke G, Meng Q, Finley T, et al. LightGBM: A highly efficient gradient boosting decision tree[C]//Advances in Neural Information Processing Systems. 2017: 3146-3154.

method: TH (2020-04-16)

Authors: Tsinghua University and Hyundai Motor Group AIRS Company

Email: Shanyu Xiao: xiaosy19@mails.tsinghua.edu.cn

Description: We have built an end-to-end scene text spotter based on Mask R-CNN and a Transformer. A ResNeXt-101 backbone and multi-scale training and testing are used.

method: Sogou_OCR (2019-11-08)

Authors: Xudong Rao, Lulu Xu, Long Ma, Xuefeng Su

Description: An arbitrary-shaped text detection method based on Mask R-CNN. We use ResNeXt-152 as the backbone, and multi-scale training and testing are adopted to obtain the final results.

Ranking Table


Date	Method	Hmean	Precision	Recall	Average Precision
2020-04-22	AntAI-Cognition	84.36%	85.92%	82.86%	78.41%
2020-04-16	TH	84.19%	87.21%	81.38%	78.36%
2019-11-08	Sogou_OCR	83.74%	87.22%	80.54%	76.87%
2019-08-08	JDAI	82.50%	84.24%	80.82%	78.13%
2019-06-02	NJU-ImagineLab	82.40%	83.20%	81.62%	77.98%
2019-05-30	PMTD	81.88%	84.15%	79.74%	76.74%
2019-06-11	4Paradigm-Data-Intelligence	81.07%	81.85%	80.30%	65.57%
2019-05-08	Baidu-VIS	80.75%	83.95%	77.79%	65.11%
2019-05-23	4Paradigm-Data-Intelligence	80.62%	81.80%	79.47%	64.81%
2019-03-23	PMTD	80.49%	83.04%	78.09%	74.19%
2019-12-13	BDN	78.69%	79.18%	78.20%	61.94%
2018-10-29	Amap-CVLab	77.20%	79.64%	74.91%	70.51%
2019-03-29	GNNets (single scale)	76.90%	82.75%	71.83%	64.55%
2018-11-15	USTC-NELSLIP	76.88%	77.47%	76.30%	71.29%
2018-11-28	CRAFT	76.71%	81.30%	72.60%	59.13%
2018-11-20	Pixel-Anchor	76.04%	83.58%	69.75%	58.14%
2018-05-18	PSENet_NJU_ImagineLab (single-scale)	74.94%	78.55%	71.65%	56.39%
2018-12-04	SPCNet_TongJi & UESTC (multi scale)	74.29%	77.17%	71.63%	55.07%
2017-11-09	EAST++	73.88%	78.90%	69.45%	56.21%
2019-07-15	stela	73.72%	78.67%	69.35%	64.14%
2019-01-08	ALGCD_CP	73.18%	76.52%	70.12%	56.41%
2018-03-12	ATL Cangjie OCR	73.04%	75.47%	70.76%	65.19%
2018-12-03	SPCNet_TongJi & UESTC (single scale)	68.08%	68.13%	68.02%	46.20%
2018-12-05	EPTN-SJTU	67.69%	73.30%	62.87%	49.91%
2019-05-30	Thesis-SE	66.83%	72.60%	61.92%	47.65%
2017-06-28	SCUT_DLVClab1	63.21%	76.76%	53.73%	48.24%
2017-06-29	SARI_FDU_RRPN_v1	62.25%	68.90%	56.77%	51.75%
2017-06-28	SARI_FDU_RRPN_v0	59.65%	65.01%	55.12%	48.79%
2017-06-30	Sensetime OCR	57.74%	48.74%	70.83%	60.84%
2017-06-30	TH-DL	43.38%	62.62%	33.18%	29.50%
2017-06-30	linkage-ER-Flow	29.78%	36.84%	24.99%	13.62%
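
The table does not define the Hmean column. Assuming it is the harmonic mean of Precision and Recall (the usual F-measure), a minimal sketch such as the following can be used to sanity-check rows. The function name `hmean` and the `rows` list are illustrative only; the values are copied from the first three rows of the table. Because Precision and Recall are themselves rounded to two decimals, the recomputed value may differ from the reported one in the last digit.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, precision %, recall %, reported Hmean %) copied from the table
rows = [
    ("AntAI-Cognition", 85.92, 82.86, 84.36),
    ("TH", 87.21, 81.38, 84.19),
    ("Sogou_OCR", 87.22, 80.54, 83.74),
]

for name, p, r, reported in rows:
    # Print the recomputed value next to the reported one for comparison
    print(f"{name}: computed {hmean(p, r):.2f}%, reported {reported:.2f}%")
```

Under this assumption the table is ranked by Hmean rather than by Precision or Recall alone, which is why TH and Sogou_OCR, with higher Precision, still sit below AntAI-Cognition.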
