method: SCUT-DLVC-Lab
Date: 2019-06-03
Authors: Canjie Luo, Tianwei Wang, Qingxiang Lin, Xiaoxue Chen, Ziyan Li, Jiaxin Zhang, Yunlong Huang, Shuaitao Zhang, Lianwen Jin
Description: This is the final result submitted by researchers from the DLVC-Lab at South China University of Technology. An image-to-sequence recognition network (a CNN-LSTM framework) is applied to extract text from the cropped image. Based on the predicted string, we choose the language whose alphabet contains the most characters of that string. The training data are the officially released training sets, including the synthesized datasets.
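
The character-voting step described above can be sketched as follows. This is a minimal illustration, not the submitted implementation: the `SCRIPT_RANGES` table, the `vote_script` function, and the `Symbols` fallback are all assumptions, and Unicode code-point ranges stand in for the per-language alphabets, which the description does not specify.

```python
from collections import Counter

# Hypothetical stand-in for each language's alphabet: Unicode code-point ranges.
# The ranges actually used by the authors are not given in the description.
SCRIPT_RANGES = {
    "Arabic": [(0x0600, 0x06FF)],
    "Latin": [(0x0041, 0x005A), (0x0061, 0x007A), (0x00C0, 0x024F)],
    "Chinese": [(0x4E00, 0x9FFF)],   # Han ideographs (also used in Japanese text)
    "Japanese": [(0x3040, 0x30FF)],  # Hiragana and Katakana
    "Korean": [(0x1100, 0x11FF), (0xAC00, 0xD7A3)],
    "Bangla": [(0x0980, 0x09FF)],
    "Hindi": [(0x0900, 0x097F)],     # Devanagari
}


def vote_script(text, default="Symbols"):
    """Return the script whose ranges cover the most characters of `text`."""
    counts = Counter()
    for ch in text:
        cp = ord(ch)
        for script, ranges in SCRIPT_RANGES.items():
            if any(lo <= cp <= hi for lo, hi in ranges):
                counts[script] += 1
                break  # each character votes for at most one script
    if not counts:
        # No character matched any alphabet (digits, punctuation, empty string).
        return default
    return max(counts, key=counts.get)


if __name__ == "__main__":
    for s in ["hello", "안녕하세요", "123!?"]:
        print(s, vote_script(s))
```

Han ideographs occur in both Chinese and Japanese text; this sketch simply counts them as Chinese, and the description does not say how the submitted method handles that overlap.
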
Confusion Matrix
Rows are ground truth (GT); columns are the detected script.

GT / Detection | Arabic | Latin | Chinese | Japanese | Korean | Bangla | Hindi | Symbols | None |
---|---|---|---|---|---|---|---|---|---|
Arabic | 4865 | 178 | 15 | 31 | 13 | 16 | 7 | 17 | 0 |
Latin | 226 | 58578 | 344 | 501 | 544 | 110 | 87 | 247 | 0 |
Chinese | 6 | 114 | 4183 | 380 | 56 | 1 | 5 | 5 | 0 |
Japanese | 50 | 1158 | 1592 | 4964 | 286 | 23 | 45 | 39 | 0 |
Korean | 41 | 1296 | 361 | 448 | 10748 | 33 | 49 | 16 | 0 |
Bangla | 11 | 62 | 4 | 5 | 17 | 2352 | 92 | 2 | 0 |
Hindi | 2 | 47 | 2 | 2 | 1 | 16 | 4153 | 1 | 0 |
Symbols | 37 | 501 | 20 | 48 | 35 | 5 | 5 | 3364 | 0 |
None | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
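
To read the matrix quantitatively, per-script recall and overall accuracy can be derived from the counts. The sketch below copies the counts from the table, with rows as ground truth and columns as the detected script; the names `LABELS`, `MATRIX`, `per_class_recall`, and `overall_accuracy` are illustrative only, and the competition's official metric may be defined differently.

```python
LABELS = ["Arabic", "Latin", "Chinese", "Japanese", "Korean",
          "Bangla", "Hindi", "Symbols", "None"]

# Rows: ground truth; columns: detected script (same order as LABELS).
MATRIX = [
    [4865, 178, 15, 31, 13, 16, 7, 17, 0],
    [226, 58578, 344, 501, 544, 110, 87, 247, 0],
    [6, 114, 4183, 380, 56, 1, 5, 5, 0],
    [50, 1158, 1592, 4964, 286, 23, 45, 39, 0],
    [41, 1296, 361, 448, 10748, 33, 49, 16, 0],
    [11, 62, 4, 5, 17, 2352, 92, 2, 0],
    [2, 47, 2, 2, 1, 16, 4153, 1, 0],
    [37, 501, 20, 48, 35, 5, 5, 3364, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]


def per_class_recall(matrix, labels):
    """Fraction of each ground-truth class that was detected correctly."""
    recalls = {}
    for i, (label, row) in enumerate(zip(labels, matrix)):
        total = sum(row)
        # A class with no ground-truth samples has undefined recall.
        recalls[label] = row[i] / total if total else None
    return recalls


def overall_accuracy(matrix):
    """Sum of the diagonal divided by the total number of samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    for label, recall in per_class_recall(MATRIX, LABELS).items():
        print(label, "n/a" if recall is None else f"{recall:.4f}")
    print("overall", f"{overall_accuracy(MATRIX):.4f}")
```
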