method: Tencent-DPPR Team 2019-06-04

Authors: Longhuang Wu, Shangxuan Tian, Haoxi Li, Sicong Liu, Jiachen Li, Chunchao Guo, Haibo Qin, Chang Liu, Hongfa Wang, Hongkai Chen, Qinglin Lu, Chun Yang, Xucheng Yin, Lei Xiao

Description: We are the Tencent-DPPR (Data Platform Precision Recommendation) team. Our text detector follows the Mask R-CNN framework, which uses masks to detect multi-oriented scene text. We apply multi-scale training. To obtain the final ensemble results, we combine two different backbones and different multi-scale testing approaches. Our recognition method recognizes text lines and their character-level language types using the ensemble results of several recognition models, which are based on CTC/Seq2Seq and CNN with self-attention/RNN. Finally, we identify the language types of the recognized results based on statistics of the MLT-2019 and Wikipedia corpora.
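The description does not say how the corpus statistics are used for language identification. As a purely illustrative sketch, one simple possibility is a smoothed character-frequency model per language, with the recognized string assigned to the highest-scoring language. Everything below is an assumption made for illustration and not the team's method: the function names (build_char_model, score, identify_language), the add-alpha smoothing, and the toy corpora, which stand in for the MLT-2019 and Wikipedia data.

```python
import math
from collections import Counter


def build_char_model(corpus_lines):
    """Count non-whitespace character frequencies over a list of text lines."""
    counts = Counter()
    for line in corpus_lines:
        counts.update(ch for ch in line if not ch.isspace())
    return counts


def score(text, counts, vocab_size, alpha=1.0):
    """Add-alpha smoothed log-likelihood of text under a unigram character model."""
    total = sum(counts.values())
    return sum(
        math.log((counts[ch] + alpha) / (total + alpha * vocab_size))
        for ch in text
        if not ch.isspace()
    )


def identify_language(text, models):
    """Return the language whose character model scores the text highest."""
    vocab = set()
    for counts in models.values():
        vocab.update(counts)
    vocab_size = len(vocab) + 1  # one extra slot for unseen characters
    return max(models, key=lambda lang: score(text, models[lang], vocab_size))


# Hypothetical toy corpora; real statistics would come from much larger data.
toy_corpora = {
    "Latin": ["hello world", "scene text detection"],
    "Korean": ["안녕하세요", "장면 텍스트"],
    "Arabic": ["مرحبا بالعالم", "نص المشهد"],
}
models = {lang: build_char_model(lines) for lang, lines in toy_corpora.items()}

print(identify_language("text", models))
print(identify_language("텍스트", models))
```

A unigram model like this mainly separates scripts. Distinguishing languages that share a script (for example, several Latin-script languages) would likely need longer character n-grams or word-level statistics.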

method: Tencent-DPPR Team 2019-06-03

Authors: Longhuang Wu, Shangxuan Tian, Haoxi Li, Sicong Liu, Jiachen Li, Chunchao Guo, Haibo Qin, Chang Liu, Hongfa Wang, Hongkai Chen, Qinglin Lu, Chun Yang, Xucheng Yin, Lei Xiao

Description: We are the Tencent-DPPR (Data Platform Precision Recommendation) team. Our text detector follows the Mask R-CNN framework, which uses masks to detect multi-oriented scene text. We apply multi-scale training. To obtain the final ensemble results, we combine two different backbones and different multi-scale testing approaches. Our recognition method recognizes text lines and their character-level language types using the ensemble results of several recognition models, which are based on CTC/Seq2Seq and CNN with self-attention/RNN. After that, we identify the language types of the recognized results based on statistics of the MLT-2019 and Wikipedia corpora.

Authors: Longhuang Wu, Shangxuan Tian, Haoxi Li, Sicong Liu, Jiachen Li, Chunchao Guo, Haibo Qin, Chang Liu, Hongfa Wang, Hongkai Chen, Qinglin Lu, Chun Yang, Xucheng Yin, Lei Xiao

Description: We are the Tencent-DPPR (Data Platform Precision Recommendation) team. Our text detector follows the Mask R-CNN framework, which uses masks to detect multi-oriented scene text. We apply multi-scale training. To obtain the final ensemble results, we combine two different backbones and different multi-scale testing approaches. Our recognition method uses the ensemble results of several recognition models, which are based on CTC/Seq2Seq and CNN with self-attention/RNN. Then we identify the language types of the recognized results based on statistics of the MLT-2019 and Wikipedia corpora.

Ranking Table

Date        Method                            Hmean   Precision  Recall  Average Precision
2019-06-04  Tencent-DPPR Team                 80.84%  87.68%     74.99%  71.72%
2019-06-03  Tencent-DPPR Team                 80.40%  88.46%     73.69%  70.45%
2019-05-27  Tencent-DPPR Team (Method_v0.1)   75.64%  82.10%     70.11%  57.66%
2019-05-27  Tencent-DPPR Team (Method_v0.2)   75.64%  82.10%     70.11%  57.66%
2019-06-04  mask_rcnn-transformer             75.12%  77.26%     73.10%  56.31%
2019-06-03  mask_rcnn-transformer             74.62%  76.74%     72.61%  55.52%
2019-06-04  TH-DL-v2                          71.01%  78.34%     64.94%  57.21%
2019-06-03  TH-DL-v1                          70.19%  77.44%     64.17%  56.38%
2019-05-27  TH-DL                             69.65%  77.08%     63.52%  55.34%
2019-05-29  DISTILLED CRAFT                   68.69%  74.97%     63.39%  54.82%
2019-06-03  CRAFTS                            68.34%  78.52%     60.50%  53.75%
2019-06-03  sot + classifier                  65.66%  66.20%     65.13%  59.49%
2019-05-28  CRAFTS(Initial)                   62.23%  72.66%     54.43%  48.57%
2019-06-03  NXB OCR                           57.74%  61.79%     54.18%  33.55%
2019-05-27  NXB OCR                           54.51%  63.87%     47.55%  30.44%
2019-05-27  TDSI-SE                           3.86%   4.44%      3.41%   0.15%
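
The Hmean column is consistent with the harmonic mean of the Precision and Recall columns, Hmean = 2PR / (P + R). A minimal sketch that rechecks the two top rows follows. The function name hmean and the 0.005 tolerance (half of the last printed digit) are choices made here for illustration, and because the published precision and recall are themselves rounded, other rows may differ in the last digit.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (both given as percentages)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (date, method, reported Hmean, precision, recall), copied from the table.
rows = [
    ("2019-06-04", "Tencent-DPPR Team", 80.84, 87.68, 74.99),
    ("2019-06-03", "Tencent-DPPR Team", 80.40, 88.46, 73.69),
]

for date, method, reported, precision, recall in rows:
    computed = hmean(precision, recall)
    # Values are reported to two decimals, so allow half of the last digit.
    matches = abs(computed - reported) < 0.005
    print(f"{date} {method}: computed {computed:.2f}%, reported {reported:.2f}%, match={matches}")
```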

Ranking Graphic
