method: GSPA_HUST (2019-05-28)
Authors: Changxu Cheng, Qiuhui Huang, Wuheng Xu and Hao Wang at Huazhong University of Science and Technology
Description: We use a Global Squeezer (GS) and a Patch Aggregator (PA) to extract features globally and locally from the full-size cropped text images. GS is a branch consisting of global average pooling (GAP) and a linear classifier that squeezes global features. PA makes full use of local predictions to aggregate local discriminative features. The softermax loss is used for intermediate supervision. In the training phase, grouping resizing is adopted to support batch training, where the samples in each batch must have the same size; this is realized by resizing images with similar aspect ratios to a suitable fixed aspect ratio. Data augmentation is also used to make the model robust. The backbone is VGG16.
The final version.
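
The grouping resizing step in the description can be illustrated with a small sketch. It is not the authors' implementation: the list of fixed aspect ratios, the fixed height of 32 pixels, and the names `TARGET_RATIOS`, `TARGET_HEIGHT`, `nearest_ratio`, and `group_by_ratio` are all assumptions made for illustration, since the description does not state them. The sketch only computes which images would share a batch and at what size; the actual image resizing is left out.

```python
from collections import defaultdict

# Hypothetical fixed aspect ratios (width / height); the description does not list them.
TARGET_RATIOS = [1.0, 2.0, 4.0, 8.0]
# Assumed common height for all resized crops.
TARGET_HEIGHT = 32


def nearest_ratio(width, height, ratios=TARGET_RATIOS):
    """Return the fixed aspect ratio closest to the image's own width / height."""
    ratio = width / height
    return min(ratios, key=lambda r: abs(r - ratio))


def group_by_ratio(sizes, ratios=TARGET_RATIOS, height=TARGET_HEIGHT):
    """Group image indices by the (width, height) they would be resized to.

    Every image in a group ends up with the same size, so each group
    can be split into batches of equally sized samples.
    """
    groups = defaultdict(list)
    for index, (w, h) in enumerate(sizes):
        r = nearest_ratio(w, h, ratios)
        target = (int(round(r * height)), height)
        groups[target].append(index)
    return dict(groups)


# Original (width, height) of five cropped text images (made-up values).
sizes = [(100, 32), (64, 60), (300, 40), (120, 64), (500, 50)]
groups = group_by_ratio(sizes)
for target, indices in sorted(groups.items()):
    print(target, indices)
```

With these made-up sizes, the third and fifth crops both map to the widest bucket and could be batched together, while the other three each fall into a bucket of their own.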
Confusion Matrix
| GT \ Detection | Arabic | Latin | Chinese | Japanese | Korean | Bangla | Hindi | Symbols | None |
|---|---|---|---|---|---|---|---|---|---|
| Arabic | 4930 | 158 | 7 | 11 | 20 | 3 | 2 | 11 | 0 |
| Latin | 346 | 58859 | 227 | 366 | 514 | 78 | 56 | 191 | 0 |
| Chinese | 11 | 227 | 3951 | 465 | 70 | 9 | 3 | 14 | 0 |
| Japanese | 76 | 1731 | 1120 | 4788 | 362 | 18 | 38 | 24 | 0 |
| Korean | 30 | 1358 | 206 | 192 | 11161 | 16 | 20 | 9 | 0 |
| Bangla | 9 | 138 | 8 | 8 | 18 | 2247 | 116 | 1 | 0 |
| Hindi | 4 | 24 | 1 | 3 | 1 | 2 | 4188 | 1 | 0 |
| Symbols | 62 | 686 | 7 | 56 | 54 | 3 | 5 | 3142 | 0 |
| None | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
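
The confusion matrix can be summarized with a short calculation. The sketch below copies the counts from the table (rows are ground truth, columns are detections) and derives overall accuracy and per-script recall. The variable names are illustrative, and the all-zero "None" row and column are left out because they contribute nothing to the totals.

```python
LABELS = ["Arabic", "Latin", "Chinese", "Japanese",
          "Korean", "Bangla", "Hindi", "Symbols"]

# Rows: ground-truth script. Columns: detected script, in the same order as LABELS.
MATRIX = [
    [4930,   158,    7,   11,    20,    3,    2,   11],  # Arabic
    [ 346, 58859,  227,  366,   514,   78,   56,  191],  # Latin
    [  11,   227, 3951,  465,    70,    9,    3,   14],  # Chinese
    [  76,  1731, 1120, 4788,   362,   18,   38,   24],  # Japanese
    [  30,  1358,  206,  192, 11161,   16,   20,    9],  # Korean
    [   9,   138,    8,    8,    18, 2247,  116,    1],  # Bangla
    [   4,    24,    1,    3,     1,    2, 4188,    1],  # Hindi
    [  62,   686,    7,   56,    54,    3,    5, 3142],  # Symbols
]

# Overall accuracy: correctly classified samples (the diagonal) over all samples.
total = sum(sum(row) for row in MATRIX)
correct = sum(MATRIX[i][i] for i in range(len(MATRIX)))
accuracy = correct / total

# Per-script recall: diagonal entry divided by its ground-truth row total.
recall = {
    label: MATRIX[i][i] / sum(MATRIX[i])
    for i, label in enumerate(LABELS)
}

print(f"accuracy: {accuracy:.4f} ({correct} / {total})")
for label, value in sorted(recall.items(), key=lambda item: item[1]):
    print(f"{label:8s} recall: {value:.4f}")
```

On these counts, 93266 of 102462 samples lie on the diagonal, roughly 91.0% overall. Japanese has the lowest recall, with most of its errors going to Latin and Chinese.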