method: 4Paradigm-Data-Intelligence (2019-06-03)

Authors: Feng Cheng, Lixin Gu, Qingjie Liu, Feng Han, Jingtao Han

Description: The detection model and the recognition model are trained separately.
Detection model: Based on Mask R-CNN, multi-scale. Training set: 2017 MLT task1 train-set.
Recognition model: Transformer-based, with a ResNet50 backbone. A voting process identifies the language of each recognized transcript. Training sets: 2017 MLT task2 train-set, 2019 MLT task2 train-set, and the 2019 MLT Synthetic dataset.
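
The description does not say how the voting is carried out. One plausible reading, sketched here purely as an assumption, is that every recognized character casts a vote for its script and the majority wins. The helper names `char_script` and `vote_language` are hypothetical, and the script labels are simply the first word of each character's Unicode name, not the competition's label set.

```python
import unicodedata
from collections import Counter


def char_script(ch):
    """Return a coarse script label for a letter, or None for non-letters.

    Assumption: the first word of the Unicode character name
    (e.g. LATIN, HANGUL, CJK, ARABIC) is a good enough script label.
    """
    if not ch.isalpha():
        return None  # digits, punctuation and spaces do not vote
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        return None  # unnamed code point


def vote_language(transcript, default="Unknown"):
    """Majority vote over the per-character script labels."""
    votes = Counter(s for s in map(char_script, transcript) if s is not None)
    if not votes:
        return default
    return votes.most_common(1)[0][0]
```

A real system would more likely vote with the recognizer's own per-character class scores; the Unicode lookup only stands in for that signal.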

method: CLOVA-AI (2019-06-04)

Authors: Bado Lee, Youngmin Baek, Hwalsuk Lee

Description: Additional head on Character-level text detection with model distillation. A pretrained detector is used.

CLOVA-AI team, Naver Corp.

method: SCUT-DLVClab2 (2017-06-30)

Authors: Yuliang Liu, Canjie Luo, Lianwen Jin, Sheng Zhang, Zhaohai Li, Lele Xie, Zenghui Sun

Description: Two models are trained separately: one for text detection and one for script classification. Their outputs are combined to produce the final results. After the detection results are generated, a classification model with 8 classes (including background) discards the detected boxes that it classifies as background with very high confidence. A 7-class model then yields the final script labels. Since the model easily confuses Chinese and Japanese, and since only a few images contain both Chinese and Japanese scripts, a statistical averaging method is used to adjust the Chinese and Japanese classification results.
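
The post-processing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the box dictionary layout, the 0.99 background threshold, and the choice to average the Chinese and Japanese probabilities over one image and relabel all such boxes with the stronger script are all hypothetical.

```python
CJ = ("Chinese", "Japanese")


def drop_background(boxes, bg_threshold=0.99):
    """Stage 1: discard boxes whose background probability is very high.

    Assumption: each box carries a "bg_prob" score from the 8-class model.
    """
    return [b for b in boxes if b["bg_prob"] < bg_threshold]


def harmonize_chinese_japanese(boxes):
    """Stage 2 correction for the boxes of a single image.

    Assumption: because few images mix Chinese and Japanese, every box
    labeled as either script is relabeled with whichever of the two has
    the higher mean probability over those boxes.
    """
    cj_boxes = [b for b in boxes if b["script"] in CJ]
    if not cj_boxes:
        return boxes
    mean = {s: sum(b["probs"][s] for b in cj_boxes) / len(cj_boxes) for s in CJ}
    winner = max(CJ, key=lambda s: mean[s])
    # Copy relabeled boxes so the input list is left unchanged.
    return [dict(b, script=winner) if b["script"] in CJ else b for b in boxes]
```

Boxes labeled with any other script pass through unchanged, so the correction only affects the Chinese/Japanese confusion that the description mentions.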

Ranking Table

Date        Method                       Hmean   Precision  Recall  Average Precision
2019-06-03  4Paradigm-Data-Intelligence  75.23%  79.26%     71.60%  56.65%
2019-06-04  CLOVA-AI                     68.31%  74.52%     63.06%  54.56%
2017-06-30  SCUT-DLVClab2                58.08%  71.78%     48.77%  41.42%
2017-06-30  TH-DL                        39.37%  58.58%     29.65%  24.54%
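
The Hmean column appears to be the harmonic mean of precision and recall; recomputing it from the rounded table values agrees with the listed figures to within rounding. A minimal sketch of that check, with `hmean` and `rows` as illustrative names:

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, listed Hmean, precision, recall), copied from the ranking table
rows = [
    ("4Paradigm-Data-Intelligence", 75.23, 79.26, 71.60),
    ("CLOVA-AI", 68.31, 74.52, 63.06),
    ("SCUT-DLVClab2", 58.08, 71.78, 48.77),
    ("TH-DL", 39.37, 58.58, 29.65),
]

for method, listed, p, r in rows:
    print(f"{method}: listed {listed:.2f}%, recomputed {hmean(p, r):.2f}%")
```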

Ranking Graphic
