method: AntAI-Cognition (2020-04-22)

Authors: Qingpei Guo, Yudong Liu, Pengcheng Yang, Yonggang Li, Yongtao Wang, Jingdong Chen, Wei Chu

Affiliation: Ant Group & PKU

Email: qingpei.gqp@antgroup.com

Description: We are from Ant Group & PKU. Our approach is an ensemble of three text detection models. The models mainly follow the Mask R-CNN framework [1] and differ in backbone: ResNeXt101-64x4d [2], CBNet [3], and ResNeXt101-32x32d_wsl [4]. A GBDT [5] is trained to normalize confidence scores and to select the highest-quality quadrilateral boxes from the outputs of all text detection models. Multi-scale training and testing are adopted for all base models. For training, we also add the ICDAR19 MLT datasets; both training and validation sets are used to produce the final result.

[1] He K, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2961-2969.
[2] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 1492-1500.
[3] Liu Y, Wang Y, Wang S, et al. CBNet: A novel composite backbone network architecture for object detection[J]. arXiv preprint arXiv:1909.03625, 2019.
[4] Mahajan D, Girshick R, Ramanathan V, et al. Exploring the limits of weakly supervised pretraining[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 181-196.
[5] Ke G, Meng Q, Finley T, et al. LightGBM: A highly efficient gradient boosting decision tree[C]//Advances in Neural Information Processing Systems. 2017: 3146-3154.
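
The selection step in the AntAI-Cognition description (rescoring boxes from several detectors, then keeping the best ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the trained GBDT is replaced by a hypothetical `rescore` callable, quadrilaterals are simplified to axis-aligned rectangles, and the names `iou`, `select_boxes`, and `iou_thresh` are assumptions introduced only for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2); axis-aligned stand-in for a quadrilateral
Detection = Tuple[Box, float]             # (box, raw confidence from one model)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def select_boxes(
    per_model: List[List[Detection]],
    rescore: Callable[[int, float], float],
    iou_thresh: float = 0.5,
) -> List[Detection]:
    """Pool detections from all models, put their scores on a common
    scale with `rescore` (hypothetical stand-in for the trained GBDT),
    then greedily keep the best-scoring box of each overlapping group."""
    pooled = [
        (box, rescore(model_idx, conf))
        for model_idx, dets in enumerate(per_model)
        for box, conf in dets
    ]
    pooled.sort(key=lambda d: d[1], reverse=True)
    kept: List[Detection] = []
    for box, score in pooled:
        if all(iou(box, kept_box) < iou_thresh for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Toy input: three models, two of which see the same text instance.
per_model = [
    [((0, 0, 10, 10), 0.9)],
    [((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)],
    [((0, 0, 10, 10), 0.6)],
]
# Hypothetical per-model calibration factors, not taken from the source.
selected = select_boxes(per_model, lambda m, c: c * [1.0, 1.2, 0.9][m])
print([box for box, _ in selected])
```

In this toy input the three overlapping boxes collapse to the one with the highest rescored confidence, while the isolated box is kept unchanged. A learned model would replace the fixed calibration factors used here.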

method: OSKDet (2021-03-21)

Authors: ludc

Description: keypoint detection

method: TH (2020-04-16)

Authors: Tsinghua University and Hyundai Motor Group AIRS Company

Email: Shanyu Xiao: xiaosy19@mails.tsinghua.edu.cn

Description: We have built an end-to-end scene text spotter based on Mask R-CNN & Transformer. The ResNeXt-101 backbone and multiscale training/testing are used.

Ranking Table

Date | Method | Hmean | Precision | Recall | Average Precision
2020-04-22 | AntAI-Cognition | 16.47% | 12.16% | 25.54% | 3.86%
2021-03-21 | OSKDet | 15.59% | 11.94% | 22.45% | 3.44%
2020-04-16 | TH | 14.72% | 11.38% | 20.85% | 3.49%
2019-11-08 | Sogou_OCR | 13.53% | 10.28% | 19.77% | 1.96%
2019-03-23 | PMTD | 13.52% | 9.69% | 22.34% | 2.90%
2019-06-02 | NJU-ImagineLab | 12.06% | 8.35% | 21.71% | 1.96%
2019-08-20 | juxinli | 12.00% | 8.63% | 19.68% | 2.50%
2021-11-02 | fpa | 11.95% | 8.58% | 19.68% | 2.48%
2019-05-30 | PMTD | 11.27% | 8.07% | 18.68% | 1.86%
2024-03-14 | gts | 10.17% | 7.85% | 14.45% | 1.41%
2020-09-28 | DCLNet | 10.05% | 7.39% | 15.68% | 1.28%
2019-08-08 | JDAI | 9.95% | 7.12% | 16.53% | 1.66%
2021-05-03 | NCU_MSP | 9.79% | 7.25% | 15.07% | 1.14%
2019-05-08 | Baidu-VIS | 9.40% | 6.86% | 14.96% | 1.01%
2019-11-05 | baseline_maskrcnn | 9.31% | 6.57% | 15.96% | 1.10%
2021-03-25 | NCU_MSP | 9.20% | 6.68% | 14.79% | 1.01%
2019-12-13 | BDN | 9.20% | 6.02% | 19.48% | 1.19%
2018-11-20 | Pixel-Anchor | 9.15% | 6.96% | 13.36% | 0.82%
2019-06-11 | 4Paradigm-Data-Intelligence | 9.14% | 6.21% | 17.31% | 1.10%
2018-12-22 | PKU_VDIG | 8.88% | 5.73% | 19.77% | 1.65%
2020-10-21 | gccnet-ensemble | 8.71% | 5.73% | 18.16% | 1.70%
2019-05-23 | 4Paradigm-Data-Intelligence | 8.64% | 5.88% | 16.22% | 0.95%
2019-03-19 | ccnet single scale | 8.50% | 6.23% | 13.39% | 0.72%
2023-05-22 | DeepSolo++ (ResNet-50) | 8.48% | 6.59% | 11.90% | 1.69%
2021-05-03 | adapt | 8.32% | 5.64% | 15.85% | 0.94%
2019-07-15 | stela | 7.99% | 5.66% | 13.59% | 0.90%
2022-04-22 | TextBPN++(ResNet-50 with DCN) | 7.80% | 5.56% | 13.10% | 0.73%
2021-03-03 | NCU_MSP_light | 7.58% | 5.07% | 15.05% | 0.86%
2020-12-08 | cascade | 7.43% | 5.33% | 12.24% | 1.36%
2018-11-15 | USTC-NELSLIP | 7.11% | 4.71% | 14.47% | 0.89%
2021-12-12 | a | 7.11% | 4.84% | 13.39% | 0.70%
2018-10-29 | Amap-CVLab | 7.09% | 4.99% | 12.24% | 1.84%
2019-03-29 | GNNets (single scale) | 6.93% | 5.37% | 9.75% | 0.40%
2020-12-08 | corner | 6.88% | 4.65% | 13.24% | 1.09%
2021-12-12 | b | 6.78% | 4.56% | 13.24% | 0.65%
2021-05-17 | NCU_FPN | 6.56% | 4.24% | 14.50% | 0.64%
2020-10-16 | Drew | 6.48% | 4.59% | 10.98% | 0.72%
2018-11-28 | CRAFT | 6.39% | 4.71% | 9.93% | 0.47%
2018-08-23 | Sogou_MM | 6.39% | 4.13% | 14.10% | 0.77%
2022-04-11 | TextBPN++(ResNet-50) | 6.19% | 4.55% | 9.67% | 0.44%
2018-12-04 | SPCNet_TongJi & UESTC (multi scale) | 5.90% | 3.94% | 11.73% | 0.46%
2018-12-13 | AutoCV | 5.81% | 3.53% | 16.33% | 0.81%
2018-05-18 | PSENet_NJU_ImagineLab (single-scale) | 5.11% | 3.69% | 8.32% | 0.31%
2018-12-02 | Shape-Aware Based Scene Text Detector (single scale) | 5.03% | 3.34% | 10.18% | 0.34%
2024-04-02 | FPDIoU | 4.93% | 4.64% | 5.26% | 0.23%
2018-01-22 | FOTS_v2 | 4.90% | 3.45% | 8.47% | 0.27%
2017-11-09 | EAST++ | 4.40% | 3.06% | 7.87% | 0.15%
2018-12-03 | SPCNet_TongJi & UESTC (single scale) | 4.18% | 2.53% | 12.13% | 0.31%
2018-03-12 | ATL Cangjie OCR | 4.05% | 2.63% | 8.75% | 0.38%
2021-12-31 | TextPMs | 3.98% | 2.80% | 6.92% | 0.19%
2023-12-17 | mlt_ch_03 | 3.79% | 2.59% | 7.07% | 0.21%
2018-12-05 | EPTN-SJTU | 3.71% | 2.41% | 7.98% | 0.11%
2017-06-28 | SCUT_DLVClab1 | 3.67% | 2.67% | 5.84% | 0.16%
2019-01-08 | ALGCD_CP | 3.64% | 2.38% | 7.75% | 0.12%
2019-05-30 | Thesis-SE | 3.46% | 2.26% | 7.47% | 0.11%
2022-01-05 | dbnet_resnet18 | 3.05% | 1.91% | 7.49% | 0.31%
2019-09-18 | mask RCNN Augment+ | 2.85% | 2.06% | 4.63% | 0.09%
2017-06-30 | TH-DL | 1.84% | 1.25% | 3.49% | 0.08%
2017-06-30 | Sensetime OCR | 1.62% | 0.88% | 10.35% | 0.20%
2019-01-03 | YY AI OCR Group | 1.20% | 0.74% | 3.12% | 0.01%
2017-06-28 | SARI_FDU_RRPN_v0 | 1.16% | 0.72% | 3.06% | 0.02%
2017-06-29 | SARI_FDU_RRPN_v1 | 0.71% | 0.46% | 1.57% | 0.01%
2019-10-14 | TextSnake | 0.26% | 0.14% | 1.34% | 0.00%
2017-06-30 | linkage-ER-Flow | 0.13% | 0.07% | 0.46% | 0.00%
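
The Hmean column appears to be the harmonic mean of Precision and Recall. The short check below, a sketch that assumes this definition, recomputes it for the top three entries from their published precision and recall. The function name `hmean` and the `rows` list are introduced here for illustration only.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, precision %, recall %, reported Hmean %) copied from the table.
rows = [
    ("AntAI-Cognition", 12.16, 25.54, 16.47),
    ("OSKDet", 11.94, 22.45, 15.59),
    ("TH", 11.38, 20.85, 14.72),
]

for name, p, r, reported in rows:
    print(f"{name}: recomputed {hmean(p, r):.3f}%, reported {reported:.2f}%")
```

For these three rows the recomputed values agree with the reported ones to within about 0.01 percentage points. Small differences are expected because the table shows precision and recall already rounded to two decimals.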

Ranking Graphic
