- Task 1 - Text Localization
- Task 2 - Script identification
- Task 3 - Joint text detection and script identification
- Task 4 - End-to-End text detection and recognition
method: Baidu-VIS (2020-06-30)
Authors: VIS-VAR Team, Baidu Inc.
Affiliation: VIS-VAR Team, Baidu Inc.
Description: We are from the Department of Computer Vision, Baidu Inc. Our method consists of three parts: text detection, script identification, and text recognition. Text detection relies mainly on LOMO and EAST; multi-scale testing is adopted, and the final result is boosted by ensembling ResNet-50 and Inception-v4 as different backbones. Next, all text lines are passed through a unified language-classification model to identify the script of the text. Eight single-language text recognition models based on Res-SENet then recognize the text-line images.
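The script-routing step described above (classify the script of each cropped line, then dispatch it to the matching single-language recognizer) can be sketched as follows. This is a minimal illustration, not the authors' code; the classifier and recognizer callables are hypothetical stand-ins for the real models.

```python
# Minimal sketch of script-identification routing (all model names hypothetical).
# A unified script classifier picks a language for a cropped text-line image,
# then the matching single-language recognizer decodes it.

def recognize_line(line_image, script_classifier, recognizers):
    """Route a cropped text-line image to its per-script recognizer."""
    script = script_classifier(line_image)   # e.g. "Latin", "Arabic", "Chinese"
    recognizer = recognizers.get(script)
    if recognizer is None:
        return script, ""                    # no recognizer for this script
    return script, recognizer(line_image)

# Toy stand-ins for the real neural models:
classifier = lambda img: "Latin"
recognizers = {"Latin": lambda img: "hello"}

print(recognize_line(None, classifier, recognizers))  # → ('Latin', 'hello')
```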
method: Tencent-DPPR Team & USTB-PRIR (2019-06-04)
Authors: Sicong Liu, Longhuang Wu, Shangxuan Tian, Haoxi Li, Chunchao Guo, Haibo Qin, Chang Liu, Hongfa Wang, Hongkai Chen, Qinglin Lu, Chun Yang, Xucheng Yin, Lei Xiao
Description: We are the Tencent-DPPR (Data Platform Precision Recommendation) team. Our detection method follows the Mask R-CNN framework, employing masks to detect multi-oriented scene text. We use the MLT-19 and MSRA-TD500 datasets to train our text detector and apply multi-scale training. To obtain the final ensemble detection results, we combine two different backbones with different multi-scale testing approaches. Our recognition methods are based on CTC/Seq2Seq decoding with CNN plus self-attention/RNN encoders. Cropped words are then recognized by the different models to obtain ensemble results.
method: Tencent-DPPR Team & USTB-PRIR (Method_v0.2) (2019-06-03)
Authors: Sicong Liu, Longhuang Wu, Shangxuan Tian, Haoxi Li, Chunchao Guo, Haibo Qin, Chang Liu, Hongfa Wang, Hongkai Chen, Qinglin Lu, Chun Yang, Xucheng Yin, Lei Xiao
Description: We are the Tencent-DPPR (Data Platform Precision Recommendation) team. Our detection method follows the Mask R-CNN framework, employing masks to detect multi-oriented scene text. We use the MLT-19 and MSRA-TD500 datasets to train our text detector and apply multi-scale training. To obtain the final ensemble detection results, we combine two different backbones with different multi-scale testing approaches. Our recognition methods are based on CTC/Seq2Seq decoding with CNN plus self-attention/RNN encoders. Cropped words are then recognized by the different models to obtain ensemble results.
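Merging detections from two backbones and several test scales, as described above, typically means pooling all boxes and suppressing duplicates. The sketch below illustrates one common way to do this (greedy non-maximum suppression on axis-aligned boxes); it is an assumption for illustration, not the team's actual ensembling code, which operates on mask-based multi-oriented detections.

```python
# Hedged sketch: greedy NMS over detections pooled from multiple models/scales.
# Boxes are (x1, y1, x2, y2, score) tuples; real MLT detections are rotated
# quadrilaterals, so this is a simplified stand-in.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, thresh=0.5):
    """Keep highest-scoring boxes; drop any box overlapping a kept one."""
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(iou(b, k) < thresh for k in kept):
            kept.append(b)
    return kept

# Detections pooled from two hypothetical models:
pool = [(0, 0, 10, 10, 0.9), (1, 1, 10, 10, 0.8), (20, 20, 30, 30, 0.7)]
print(nms(pool))  # keeps the 0.9 and 0.7 boxes; the 0.8 box is suppressed
```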
Date | Method | Hmean | Precision | Recall | Average Precision | 1-NED | 1-NED (Case Sens.) | Hmean (Case Sens.)
---|---|---|---|---|---|---|---|---
2020-06-30 | Baidu-VIS | 59.72% | 72.82% | 50.62% | 41.32% | 57.26% | 56.97% | 59.01%
2019-06-04 | Tencent-DPPR Team & USTB-PRIR | 59.15% | 71.26% | 50.55% | 35.92% | 58.46% | 58.10% | 58.37%
2019-06-03 | Tencent-DPPR Team & USTB-PRIR (Method_v0.2) | 58.92% | 71.67% | 50.02% | 41.76% | 58.00% | 57.64% | 58.14%
2019-06-03 | CRAFTS | 51.74% | 65.68% | 42.68% | 34.95% | 48.27% | 47.75% | 50.74%
2019-05-27 | Tencent-DPPR Team & USTB-PRIR (Method_v0.1) | 51.70% | 56.12% | 47.93% | 26.88% | 56.18% | 55.65% | 50.86%
2019-06-04 | mask_rcnn-transformer | 51.04% | 52.51% | 49.64% | 25.96% | 55.71% | 54.10% | 49.34%
2019-06-03 | mask_rcnn-transformer | 50.44% | 51.90% | 49.07% | 25.34% | 55.28% | 54.14% | 49.11%
2019-05-28 | CRAFTS (Initial) | 46.99% | 66.21% | 36.41% | 30.54% | 42.52% | 42.01% | 45.97%
2019-06-04 | Three-stage method | 40.19% | 44.37% | 36.73% | 17.82% | 46.01% | 43.86% | 37.45%
2019-06-04 | TH-DL-v2 | 37.32% | 41.22% | 34.10% | 19.73% | 46.19% | 45.68% | 36.50%
2019-06-03 | TH-DL-v1 | 34.49% | 38.10% | 31.51% | 17.48% | 42.76% | 42.25% | 33.69%
2019-06-03 | NXB OCR | 32.07% | 34.37% | 30.06% | 10.35% | 35.48% | 35.06% | 31.50%
2019-05-27 | TH-DL | 31.69% | 35.13% | 28.87% | 14.33% | 40.39% | 39.82% | 30.79%
2019-05-27 | NXB OCR | 28.42% | 33.39% | 24.74% | 7.96% | 31.50% | 31.19% | 27.93%
2019-05-22 | E2E-MLT | 26.46% | 37.44% | 20.47% | 7.72% | 26.39% | 25.71% | 24.85%
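The two headline metrics in the table can be computed as sketched below: Hmean is the harmonic mean of precision and recall, and 1-NED is one minus the edit distance between predicted and ground-truth transcriptions, normalized by the longer string. This is an illustrative sketch, not the official evaluation script.

```python
# Illustrative computation of the table's metrics (not the official evaluator).

def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F1 / Hmean column)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def one_minus_ned(pred, gt):
    """1-NED: edit distance normalized by the longer string, subtracted from 1."""
    if not pred and not gt:
        return 1.0
    return 1.0 - edit_distance(pred, gt) / max(len(pred), len(gt))

print(round(hmean(0.7282, 0.5062), 4))  # → 0.5972, matching the Baidu-VIS row
```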