- Task 1 - Text Localization
- Task 2 - Script identification
- Task 3 - Joint text detection and script identification
method: AntAI-Cognition (2020-04-22)
Authors: Qingpei Guo, Yudong Liu, Pengcheng Yang, Yonggang Li, Yongtao Wang, Jingdong Chen, Wei Chu
Affiliation: Ant Group & PKU
Email: qingpei.gqp@antgroup.com
Description: We are from Ant Group & PKU. Our approach is an ensemble of three text detection models. The models mainly follow the Mask R-CNN framework [1], each with a different backbone (ResNeXt101-64x4d [2], CBNet [3], ResNeXt101-32x32d_wsl [4]). A GBDT [5] is trained to normalize confidence scores and to select the highest-quality quadrilateral boxes from the outputs of all text detection models. Multi-scale training and testing are adopted for all base models. We also add the ICDAR19 MLT datasets to the training data; both training and validation sets are used to obtain the final result.
[1] He K, Gkioxari G, Dollár P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2961-2969.
[2] Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 1492-1500.
[3] Liu Y, Wang Y, Wang S, et al. CBNet: A novel composite backbone network architecture for object detection[J]. arXiv preprint arXiv:1909.03625, 2019.
[4] Mahajan D, Girshick R, Ramanathan V, et al. Exploring the limits of weakly supervised pretraining[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 181-196.
[5] Ke G, Meng Q, Finley T, et al. LightGBM: A highly efficient gradient boosting decision tree[C]//Advances in Neural Information Processing Systems. 2017: 3146-3154.
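
To make the ensemble idea in the description above concrete, here is a minimal sketch of pooling detections from several models. It is not the authors' method: their trained GBDT is replaced by plain per-model min-max score normalization followed by greedy non-maximum suppression, and quadrilaterals are simplified to axis-aligned rectangles. The names `normalize`, `iou`, `ensemble` and the threshold `iou_thr` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); axis-aligned simplification
Det = Tuple[Box, float]                  # (box, raw confidence score)


def normalize(dets: List[Det]) -> List[Det]:
    """Min-max rescale one model's scores to [0, 1] (stand-in for the GBDT)."""
    if not dets:
        return []
    scores = [s for _, s in dets]
    lo, hi = min(scores), max(scores)
    span = hi - lo
    return [(b, (s - lo) / span if span > 0 else 1.0) for b, s in dets]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ensemble(per_model: List[List[Det]], iou_thr: float = 0.5) -> List[Det]:
    """Pool normalized detections, then keep the best-scoring box per overlap group."""
    pooled = [d for dets in per_model for d in normalize(dets)]
    pooled.sort(key=lambda d: d[1], reverse=True)
    kept: List[Det] = []
    for box, score in pooled:
        # Keep a box only if it does not overlap strongly with a better one.
        if all(iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score))
    return kept


# Two hypothetical models whose raw scores live on different scales.
model_a = [((0, 0, 10, 10), 0.9), ((20, 20, 30, 30), 0.5)]
model_b = [((1, 1, 11, 11), 30.0), ((50, 50, 60, 60), 10.0)]
merged = ensemble([model_a, model_b])
print(len(merged), "boxes kept")
```

Normalizing before pooling matters because the models' raw scores are not directly comparable; a learned calibrator such as the GBDT mentioned in the description would take the place of `normalize` here.
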
method: OSKDet (2021-03-21)
Authors: ludc
Description: keypoint detection
method: TH (2020-04-16)
Authors: Tsinghua University and Hyundai Motor Group AIRS Company
Email: Shanyu Xiao: xiaosy19@mails.tsinghua.edu.cn
Description: We have built an end-to-end scene text spotter based on Mask R-CNN & Transformer. The ResNeXt-101 backbone and multiscale training/testing are used.
Date | Method | Hmean | Precision | Recall | Average Precision
---|---|---|---|---|---
2020-04-22 | AntAI-Cognition | 16.47% | 12.16% | 25.54% | 3.86%
2021-03-21 | OSKDet | 15.59% | 11.94% | 22.45% | 3.44%
2020-04-16 | TH | 14.72% | 11.38% | 20.85% | 3.49%
2019-11-08 | Sogou_OCR | 13.53% | 10.28% | 19.77% | 1.96%
2019-03-23 | PMTD | 13.52% | 9.69% | 22.34% | 2.90%
2019-06-02 | NJU-ImagineLab | 12.06% | 8.35% | 21.71% | 1.96%
2019-08-20 | juxinli | 12.00% | 8.63% | 19.68% | 2.50%
2021-11-02 | fpa | 11.95% | 8.58% | 19.68% | 2.48%
2019-05-30 | PMTD | 11.27% | 8.07% | 18.68% | 1.86%
2024-03-14 | gts | 10.17% | 7.85% | 14.45% | 1.41%
2020-09-28 | DCLNet | 10.05% | 7.39% | 15.68% | 1.28%
2019-08-08 | JDAI | 9.95% | 7.12% | 16.53% | 1.66%
2021-05-03 | NCU_MSP | 9.79% | 7.25% | 15.07% | 1.14%
2019-05-08 | Baidu-VIS | 9.40% | 6.86% | 14.96% | 1.01%
2019-11-05 | baseline_maskrcnn | 9.31% | 6.57% | 15.96% | 1.10%
2021-03-25 | NCU_MSP | 9.20% | 6.68% | 14.79% | 1.01%
2019-12-13 | BDN | 9.20% | 6.02% | 19.48% | 1.19%
2018-11-20 | Pixel-Anchor | 9.15% | 6.96% | 13.36% | 0.82%
2019-06-11 | 4Paradigm-Data-Intelligence | 9.14% | 6.21% | 17.31% | 1.10%
2018-12-22 | PKU_VDIG | 8.88% | 5.73% | 19.77% | 1.65%
2020-10-21 | gccnet-ensemble | 8.71% | 5.73% | 18.16% | 1.70%
2019-05-23 | 4Paradigm-Data-Intelligence | 8.64% | 5.88% | 16.22% | 0.95%
2019-03-19 | ccnet single scale | 8.50% | 6.23% | 13.39% | 0.72%
2023-05-22 | DeepSolo++ (ResNet-50) | 8.48% | 6.59% | 11.90% | 1.69%
2021-05-03 | adapt | 8.32% | 5.64% | 15.85% | 0.94%
2019-07-15 | stela | 7.99% | 5.66% | 13.59% | 0.90%
2022-04-22 | TextBPN++(ResNet-50 with DCN) | 7.80% | 5.56% | 13.10% | 0.73%
2021-03-03 | NCU_MSP_light | 7.58% | 5.07% | 15.05% | 0.86%
2020-12-08 | cascade | 7.43% | 5.33% | 12.24% | 1.36%
2018-11-15 | USTC-NELSLIP | 7.11% | 4.71% | 14.47% | 0.89%
2021-12-12 | a | 7.11% | 4.84% | 13.39% | 0.70%
2018-10-29 | Amap-CVLab | 7.09% | 4.99% | 12.24% | 1.84%
2019-03-29 | GNNets (single scale) | 6.93% | 5.37% | 9.75% | 0.40%
2020-12-08 | corner | 6.88% | 4.65% | 13.24% | 1.09%
2021-12-12 | b | 6.78% | 4.56% | 13.24% | 0.65%
2021-05-17 | NCU_FPN | 6.56% | 4.24% | 14.50% | 0.64%
2020-10-16 | Drew | 6.48% | 4.59% | 10.98% | 0.72%
2018-11-28 | CRAFT | 6.39% | 4.71% | 9.93% | 0.47%
2018-08-23 | Sogou_MM | 6.39% | 4.13% | 14.10% | 0.77%
2022-04-11 | TextBPN++(ResNet-50) | 6.19% | 4.55% | 9.67% | 0.44%
2018-12-04 | SPCNet_TongJi & UESTC (multi scale) | 5.90% | 3.94% | 11.73% | 0.46%
2018-12-13 | AutoCV | 5.81% | 3.53% | 16.33% | 0.81%
2018-05-18 | PSENet_NJU_ImagineLab (single-scale) | 5.11% | 3.69% | 8.32% | 0.31%
2018-12-02 | Shape-Aware Based Scene Text Detector (single scale) | 5.03% | 3.34% | 10.18% | 0.34%
2024-04-02 | FPDIoU | 4.93% | 4.64% | 5.26% | 0.23%
2018-01-22 | FOTS_v2 | 4.90% | 3.45% | 8.47% | 0.27%
2017-11-09 | EAST++ | 4.40% | 3.06% | 7.87% | 0.15%
2018-12-03 | SPCNet_TongJi & UESTC (single scale) | 4.18% | 2.53% | 12.13% | 0.31%
2018-03-12 | ATL Cangjie OCR | 4.05% | 2.63% | 8.75% | 0.38%
2021-12-31 | TextPMs | 3.98% | 2.80% | 6.92% | 0.19%
2023-12-17 | mlt_ch_03 | 3.79% | 2.59% | 7.07% | 0.21%
2018-12-05 | EPTN-SJTU | 3.71% | 2.41% | 7.98% | 0.11%
2017-06-28 | SCUT_DLVClab1 | 3.67% | 2.67% | 5.84% | 0.16%
2019-01-08 | ALGCD_CP | 3.64% | 2.38% | 7.75% | 0.12%
2019-05-30 | Thesis-SE | 3.46% | 2.26% | 7.47% | 0.11%
2022-01-05 | dbnet_resnet18 | 3.05% | 1.91% | 7.49% | 0.31%
2019-09-18 | mask RCNN Augment+ | 2.85% | 2.06% | 4.63% | 0.09%
2017-06-30 | TH-DL | 1.84% | 1.25% | 3.49% | 0.08%
2017-06-30 | Sensetime OCR | 1.62% | 0.88% | 10.35% | 0.20%
2019-01-03 | YY AI OCR Group | 1.20% | 0.74% | 3.12% | 0.01%
2017-06-28 | SARI_FDU_RRPN_v0 | 1.16% | 0.72% | 3.06% | 0.02%
2017-06-29 | SARI_FDU_RRPN_v1 | 0.71% | 0.46% | 1.57% | 0.01%
2019-10-14 | TextSnake | 0.26% | 0.14% | 1.34% | 0.00%
2017-06-30 | linkage-ER-Flow | 0.13% | 0.07% | 0.46% | 0.00%
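
The Hmean column is not defined on this page. Assuming it follows the usual convention of the harmonic mean of precision and recall (the F1 score), the short sketch below recomputes it for the top three rows of the table. The helper name `hmean` is hypothetical, and the inputs are the rounded percentages as listed.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given in the same unit."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing was detected correctly
    return 2 * precision * recall / (precision + recall)


# (method, precision %, recall %, Hmean % as listed in the table)
rows = [
    ("AntAI-Cognition", 12.16, 25.54, 16.47),
    ("OSKDet", 11.94, 22.45, 15.59),
    ("TH", 11.38, 20.85, 14.72),
]

for name, p, r, listed in rows:
    print(f"{name}: computed {hmean(p, r):.2f}%, listed {listed:.2f}%")
```

Because the listed precision and recall are themselves rounded to two decimals, a recomputed value can differ from the listed Hmean in the last digit.
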