method: SCUT-DLVC-Lab 2019-06-03

Authors: Canjie Luo, Tianwei Wang, Qingxiang Lin, Xiaoxue Chen, Ziyan Li, Jiaxin Zhang, Yunlong Huang, Shuaitao Zhang, Lianwen Jin

Description: This is the final result submitted by researchers from the DLVC-Lab at South China University of Technology. An image-to-sequence recognition network (a CNN-LSTM framework) is applied to extract text from the cropped image. Based on the predicted string, we choose the language whose alphabet contains the most characters of that string. The training data are the officially released training sets, including the synthesized datasets.
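The character-voting step described above can be sketched as follows. This is only a minimal illustration, not the submitted implementation: the `ALPHABETS` table, the Unicode ranges in it, the tie-breaking order, and the fallback to "Symbols" are all assumptions made for the example.

```python
# Hypothetical alphabets expressed as Unicode code-point ranges.
# These ranges are an assumption for illustration, not the authors' tables.
ALPHABETS = {
    "Arabic": [(0x0600, 0x06FF)],
    "Latin": [(0x0041, 0x005A), (0x0061, 0x007A), (0x00C0, 0x024F)],
    "Chinese": [(0x4E00, 0x9FFF)],
    "Japanese": [(0x3040, 0x30FF), (0x4E00, 0x9FFF)],  # kana plus shared Han
    "Korean": [(0xAC00, 0xD7A3), (0x1100, 0x11FF)],
    "Bangla": [(0x0980, 0x09FF)],
    "Hindi": [(0x0900, 0x097F)],
}


def in_alphabet(ch, ranges):
    """Return True if the character's code point falls in any of the ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def vote_script(text):
    """Pick the language whose alphabet covers the most characters of text.

    Ties go to the language listed first in ALPHABETS (an assumption);
    strings with no matching character fall back to "Symbols" (an assumption).
    """
    counts = {
        lang: sum(in_alphabet(ch, ranges) for ch in text)
        for lang, ranges in ALPHABETS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "Symbols"
```

For example, `vote_script("日本語のテキスト")` counts three Han characters for Chinese but eight characters (Han plus kana) for Japanese, so Japanese wins. A string made only of Han characters ties between the two and is resolved by the listing order, which suggests why Chinese and Japanese are easy to confuse under this kind of rule.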

Confusion Matrix

Detection
Arabic Latin Chinese Japanese Korean Bangla Hindi Symbols None
GT
Arabic 4865178153113167170
Latin 22658578344501544110872470
Chinese 61144183380561550
Japanese 501158159249642862345390
Korean 411296361448107483349160
Bangla 1162451723529220
Hindi 24722116415310
Symbols 375012048355533640
None 0 0 0 0 0 0 0 0 0