Method: NXB OCR
Date: 2019-06-03
Authors: Yupeng Cao(X), Qiufeng Wang*(X), Qi Qu(B), Jing Li(X), Cheng Cheng*(N), Kaizhu Huang*(X) (*: Equal Contribution)
Description: A CNN-based method is used to train a script identification classifier on cropped word images [1]. We use the VGG19 architecture as the model. Input images are resized to 32×32 pixels. We add batch normalization after each convolutional layer and use max pooling for the pooling layers.
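To illustrate what a 32×32 input implies for this architecture, the following is a minimal sketch that traces feature-map sizes through the standard VGG19 layer layout. It is not the authors' code. The 3-channel input, the 3×3 convolutions with padding 1, the ReLU activation, and the 2×2 stride-2 pooling are assumptions taken from the usual VGG19 definition, and `VGG19_CFG` and `trace_shapes` are hypothetical names.

```python
# Standard VGG19 layout (configuration "E"): integers are the output channels
# of 3x3 convolutions, "M" marks a 2x2 max-pooling layer with stride 2.
# Assumption: the submission follows this layout unchanged.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def trace_shapes(cfg, in_channels=3, size=32):
    """Return (layer description, channels, height/width) for every layer."""
    shapes = []
    channels = in_channels
    for item in cfg:
        if item == "M":
            size //= 2  # 2x2 max pooling with stride 2 halves the resolution
            shapes.append(("maxpool", channels, size))
        else:
            channels = item  # 3x3 conv with padding 1 keeps the resolution
            # Batch normalization follows every convolution, as described;
            # the ReLU is an assumption from the usual VGG definition.
            shapes.append(("conv3x3 + batchnorm + relu", channels, size))
    return shapes


shapes = trace_shapes(VGG19_CFG)
num_convs = sum(1 for name, _, _ in shapes if name.startswith("conv"))
final_name, final_channels, final_size = shapes[-1]
print(num_convs, final_channels, final_size)  # prints: 16 512 1
```

Under these assumptions the five pooling stages reduce a 32×32 word image to a single 512-dimensional feature vector, so the classifier head sees no remaining spatial structure.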
P.S. Affiliation of Authors
(X: Xi’an Jiaotong-Liverpool University;
N: Institute of Nanotechnology and Nano-Bionics, Chinese Academy of Sciences;
B: Beijing Babel Technology Co., Ltd.)
[1] Bušta M, Patel Y, Matas J. E2E-MLT - an unconstrained end-to-end method for multi-language scene text. arXiv preprint arXiv:1801.09919, 2018.
Confusion Matrix
GT (rows) / Detection (columns) | Arabic | Latin | Chinese | Japanese | Korean | Bangla | Hindi | Symbols | None |
---|---|---|---|---|---|---|---|---|---|
Arabic | 4531 | 473 | 18 | 54 | 35 | 9 | 9 | 13 | 0 |
Latin | 304 | 57835 | 403 | 825 | 662 | 176 | 140 | 292 | 0 |
Chinese | 19 | 469 | 3055 | 1062 | 90 | 24 | 16 | 15 | 0 |
Japanese | 94 | 2351 | 1286 | 3923 | 319 | 66 | 72 | 46 | 0 |
Korean | 99 | 2627 | 515 | 667 | 8964 | 63 | 32 | 25 | 0 |
Bangla | 4 | 224 | 18 | 31 | 16 | 2096 | 155 | 1 | 0 |
Hindi | 6 | 109 | 5 | 9 | 2 | 71 | 4020 | 2 | 0 |
Symbols | 63 | 1192 | 8 | 174 | 30 | 8 | 15 | 2525 | 0 |
None | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
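
The counts in the confusion matrix can be summarized as per-script recall and precision plus an overall accuracy. The sketch below shows one way to do that. The counts are copied from the matrix with ground truth in rows and detections in columns. The helper names `safe_ratio` and `per_class_scores` are hypothetical, and these scores are not necessarily the metric the competition itself reports.

```python
LABELS = ["Arabic", "Latin", "Chinese", "Japanese", "Korean",
          "Bangla", "Hindi", "Symbols", "None"]

# Rows are ground truth (GT), columns are detections, both in LABELS order.
MATRIX = [
    [4531, 473, 18, 54, 35, 9, 9, 13, 0],
    [304, 57835, 403, 825, 662, 176, 140, 292, 0],
    [19, 469, 3055, 1062, 90, 24, 16, 15, 0],
    [94, 2351, 1286, 3923, 319, 66, 72, 46, 0],
    [99, 2627, 515, 667, 8964, 63, 32, 25, 0],
    [4, 224, 18, 31, 16, 2096, 155, 1, 0],
    [6, 109, 5, 9, 2, 71, 4020, 2, 0],
    [63, 1192, 8, 174, 30, 8, 15, 2525, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]


def safe_ratio(numerator, denominator):
    """Divide, returning None for an empty class instead of raising."""
    return numerator / denominator if denominator else None


def per_class_scores(matrix, labels):
    """Recall uses the row (GT) total, precision the column (detection) total."""
    scores = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        col_total = sum(row[i] for row in matrix)
        scores[label] = {
            "recall": safe_ratio(matrix[i][i], row_total),
            "precision": safe_ratio(matrix[i][i], col_total),
        }
    return scores


scores = per_class_scores(MATRIX, LABELS)
correct = sum(MATRIX[i][i] for i in range(len(LABELS)))
total = sum(sum(row) for row in MATRIX)
accuracy = correct / total

for label, s in scores.items():
    if s["recall"] is not None:  # skip the empty "None" class
        print(f"{label:8s} recall={s['recall']:.4f} precision={s['precision']:.4f}")
print(f"overall accuracy={accuracy:.4f}")
```

Reading along the rows makes the main confusions visible: the largest off-diagonal counts are Korean and Japanese words detected as Latin, and Japanese and Chinese words detected as each other.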