Results - ICDAR 2019 Robust Reading Challenge on Multi-lingual scene text detection and recognition

method: SituTech_OCR (2021-03-11)

Authors: Kui Lyu, Chuanhe Liu

Affiliation: Beijing Situ Vision Technologies Co. Ltd

Description: In this work, we design a text detection model. Our detector is similar to DBNet, but there are some differences. More specifically, we use EfficientDet as the detector backbone, which offers flexible scaling and a stronger ability to extract features. We also optimized the label generation strategy of DBNet. In the original work, both the generation of the positive area and the expansion of the positive area back to the bounding box use the Vatti clipping algorithm, which is less robust across different area-to-perimeter ratios. We optimized this step to make the transform between the positive area and the bounding box more reasonable.
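The sensitivity to the area-to-perimeter ratio mentioned above can be illustrated with the shrink offset from the original DBNet paper, D = A(1 - r^2)/L, where A is the polygon area, L its perimeter, and r the shrink ratio (0.4 in that paper). The sketch below covers only the original DBNet formula. The submission does not describe its modified strategy, so nothing here should be read as the authors' method. The helper names and the two example boxes are hypothetical.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of a closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def shrink_offset(points, shrink_ratio=0.4):
    """Offset D = A * (1 - r^2) / L used by DBNet before Vatti clipping."""
    area = polygon_area(points)
    perimeter = polygon_perimeter(points)
    return area * (1.0 - shrink_ratio ** 2) / perimeter


# Hypothetical boxes: same text height (20 px), very different lengths.
square = [(0, 0), (20, 0), (20, 20), (0, 20)]
long_line = [(0, 0), (400, 0), (400, 20), (0, 20)]

for name, box in [("square", square), ("long_line", long_line)]:
    print(name, round(shrink_offset(box), 2))
```

Under these assumptions the long text line receives roughly twice the offset of the square box even though both are 20 px high, so its shrunk positive area is much thinner relative to its height.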

If you have any questions, please contact us.
SituAIgorithm Team, Beijing Situ Vision Technologies Co. Ltd

Liao, Minghui, et al. "Real-time scene text detection with differentiable binarization." Proceedings of the AAAI Conference on Artificial Intelligence, 2020.

Tan, Mingxing, Ruoming Pang, and Quoc V. Le. "EfficientDet: Scalable and Efficient Object Detection." CVPR 2020.


method: TH (2020-04-19)

Authors: Tsinghua University and Hyundai Motor Group AIRS Company

Email: Shanyu Xiao: xiaosy19@mails.tsinghua.edu.cn

Description: We have built an end-to-end scene text spotter based on Mask R-CNN & Transformer. The ResNeXt-101 backbone and multiscale training/testing are used.

method: multi-stage_text_detector_v4 (2019-06-04)

Authors: Pengfei Wang~*, Mengyi En*, Xiaoqiang Zhang*, Chengquan Zhang*

Affiliation: VIS-VAR Team, Baidu Inc.*; Xidian University~

Description: The method mainly relies on a two-stage text detector, LOMO [1], which is inspired by Mask R-CNN and introduces an iterative refinement module that refines the boundary of each text region one or more times during testing to obtain more accurate detection results. ICDAR15 and part of KAIST are used as extra training data. Multi-scale testing is adopted, and the final result combines LOMO models with ResNet-50 and Inception-v4 backbones.

*This work was done while Pengfei Wang was an intern at Baidu Inc.

[1] Zhang, Chengquan, et al. "Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes." arXiv preprint arXiv:1904.06535 (2019).

Ranking Table


Date	Method	Hmean	Precision	Recall	Average Precision
2021-03-11	SituTech_OCR	62.45%	54.60%	72.93%	39.06%
2020-04-19	TH	57.64%	48.07%	71.96%	48.51%
2019-06-04	multi-stage_text_detector_v4	56.37%	45.16%	74.99%	35.34%
2019-06-03	multi-stage_text_detector	55.69%	44.49%	74.44%	34.47%
2019-06-04	multi-stage_text_detector_v3	55.43%	43.99%	74.90%	34.32%
2019-06-04	multi-stage_text_detector_v2	55.31%	43.77%	75.11%	34.18%
2019-05-27	Tencent-DPPR Team (Method_v0.1)	55.18%	47.80%	65.26%	43.16%
2019-11-11	Sogou_OCR	54.73%	45.24%	69.27%	46.57%
2019-06-03	Tencent-DPPR Team (Method_v0.2)	54.49%	43.78%	72.16%	47.86%
2019-06-04	Tencent-DPPR Team (Method_v0.3)	54.29%	43.45%	72.34%	47.80%
2019-06-04	Tencent-DPPR Team	54.29%	43.43%	72.38%	47.78%
2019-06-03	NJU-ImagineLab(v3)	53.62%	42.44%	72.80%	48.78%
2019-05-30	PMTD	53.02%	42.11%	71.56%	49.26%
2022-11-02	ESTextSpotter	48.42%	38.30%	65.82%	42.13%
2019-05-27	TH-DL	47.92%	40.60%	58.47%	28.12%
2019-06-04	TH-DL-v2	47.91%	39.96%	59.81%	29.77%
2019-06-03	TH-DL-v1	47.84%	39.99%	59.52%	29.25%
2019-06-03	mm-maskrcnn_v2	46.98%	38.00%	61.51%	38.58%
2019-05-31	A two-stage text detector based on cascade rcnn	46.31%	36.13%	64.47%	40.73%
2019-06-02	A two-stage text detector based on cascade rcnn(using total 10000 images of mlt19)	45.75%	35.15%	65.54%	40.48%
2019-05-29	IC_RL	45.55%	33.60%	70.70%	24.80%
2021-02-04	NCU_MSP	45.51%	35.00%	65.05%	22.77%
2023-05-22	DeepSolo++ (ResNet-50)	45.45%	40.17%	52.32%	32.32%
2019-05-29	maskrcnn++ result	45.18%	32.88%	72.16%	24.76%
2019-06-02	DISTILLED CRAFT	44.71%	37.51%	55.34%	26.73%
2020-10-16	Drew	43.92%	35.16%	58.47%	32.58%
2019-05-26	two stage text detector	42.58%	33.37%	58.83%	34.28%
2019-06-03	CRAFTS	42.10%	36.28%	50.15%	21.36%
2019-06-03	sot	39.88%	29.95%	59.64%	34.85%
2020-05-30	NCU	39.87%	28.27%	67.60%	19.22%
2019-05-28	CRAFTS(Initial)	38.98%	31.03%	52.41%	17.55%
2019-06-03	text-mountain	37.01%	25.64%	66.47%	17.82%
2019-06-04	Unicamp-SRBR-MLT2019-PELEETEXT	36.70%	28.28%	52.24%	26.22%
2019-06-03	RRPN	36.11%	26.81%	55.28%	27.88%
2019-05-24	PSENet_v1	34.47%	27.67%	45.69%	22.64%
2023-05-30	TD-PPIoU	34.42%	22.91%	69.17%	39.54%
2019-06-04	Unicamp-SRBR-MLT2019-FUSION-PSENET-PELEETEXT	33.89%	25.10%	52.14%	21.80%
2019-05-27	MLT2019 ETD	33.77%	26.20%	47.52%	12.57%
2019-05-27	CLTDR	33.68%	26.83%	45.25%	12.27%
2019-06-04	Lomin OCR	30.29%	21.20%	53.02%	22.43%
2019-05-27	NXB OCR	29.75%	21.98%	46.01%	14.44%
2019-06-03	TP	28.94%	26.06%	32.55%	9.80%
2019-06-03	NXB OCR	28.86%	19.26%	57.55%	11.20%
2019-05-28	Unicamp-SRBR-MLT2019-S1	28.07%	26.13%	30.33%	16.13%
2020-10-07	MEAST_V2_8_oct	27.49%	19.78%	45.04%	13.80%
2020-10-23	MEAST_V3_23_Oct	26.82%	18.88%	46.23%	13.82%
2019-06-04	Cyberspace	26.02%	19.38%	39.59%	8.62%
2019-05-28	PydBox-TextDetector	11.13%	11.63%	10.67%	1.40%
2020-12-15	DSIT-UOA	2.62%	1.54%	8.79%	0.10%
2019-05-05	AAAA	0.01%	0.01%	0.01%	0.00%
2019-05-27	4Paradigm-Data-Intelligence	0.00%	0.00%	0.00%	0.00%
2019-05-27	Unicamp-SRBR-MLT2019-S1	0.00%	0.00%	0.00%	0.00%
2019-06-01	tsinghuaee51_MLT2019	0.00%	0.00%	0.00%	0.00%
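The Hmean column can be cross-checked against the Precision and Recall columns, assuming Hmean denotes the harmonic mean of precision and recall (the page does not define it, so this is an assumption). The sketch below recomputes it for the top three rows. The function name is hypothetical, and small deviations are expected because the published percentages are rounded.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (method, precision %, recall %, reported Hmean %) copied from the table.
rows = [
    ("SituTech_OCR", 54.60, 72.93, 62.45),
    ("TH", 48.07, 71.96, 57.64),
    ("multi-stage_text_detector_v4", 45.16, 74.99, 56.37),
]

for method, p, r, reported in rows:
    print(f"{method}: computed {hmean(p, r):.2f}%, reported {reported:.2f}%")
```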
