method: TPS-ResNet (2019-06-04)
Authors: Jeonghun Baek, Youngmin Baek, Seung Shin, Bado Lee, Chae Young Lee, and Hwalsuk Lee
Description: We used a thin-plate-spline (TPS) based spatial transformer network (STN), which normalizes the input text images, followed by a ResNet-based feature extractor, a BiLSTM, and an attention mechanism.
This model was developed based on the analysis of scene text recognition modules.
See our paper and source code.
Clova AI OCR Team, NAVER/LINE Corp.
Confusion Matrix
Rows are ground-truth (GT) scripts; columns are detected scripts.

| GT \ Detection | Arabic | Latin | Chinese | Japanese | Korean | Bangla | Hindi | Symbols | None |
|---|---|---|---|---|---|---|---|---|---|
| Arabic | 4707 | 301 | 38 | 58 | 15 | 2 | 2 | 19 | 0 |
| Latin | 121 | 59133 | 428 | 375 | 244 | 33 | 7 | 296 | 0 |
| Chinese | 47 | 415 | 3769 | 472 | 28 | 4 | 10 | 5 | 0 |
| Japanese | 81 | 1494 | 1233 | 5132 | 164 | 19 | 9 | 25 | 0 |
| Korean | 84 | 1012 | 290 | 168 | 11362 | 24 | 40 | 12 | 0 |
| Bangla | 11 | 106 | 29 | 16 | 20 | 2339 | 23 | 1 | 0 |
| Hindi | 8 | 103 | 26 | 11 | 9 | 22 | 4040 | 5 | 0 |
| Symbols | 30 | 928 | 66 | 311 | 23 | 3 | 3 | 2651 | 0 |
| None | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
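
The confusion matrix can be summarized as per-script recall (the share of each ground-truth row that lands on the diagonal) and overall accuracy. The following is a minimal sketch, not the challenge's official scoring code. The counts are copied from the table above, and the names `LABELS`, `MATRIX`, `per_class_recall`, and `overall_accuracy` are illustrative assumptions. It also assumes rows are ground truth and columns are detections, as the table labels indicate.

```python
# Counts copied from the confusion matrix above.
# Rows: ground-truth script. Columns: detected script (same order as LABELS).
LABELS = ["Arabic", "Latin", "Chinese", "Japanese", "Korean",
          "Bangla", "Hindi", "Symbols", "None"]

MATRIX = [
    [4707, 301, 38, 58, 15, 2, 2, 19, 0],       # Arabic
    [121, 59133, 428, 375, 244, 33, 7, 296, 0], # Latin
    [47, 415, 3769, 472, 28, 4, 10, 5, 0],      # Chinese
    [81, 1494, 1233, 5132, 164, 19, 9, 25, 0],  # Japanese
    [84, 1012, 290, 168, 11362, 24, 40, 12, 0], # Korean
    [11, 106, 29, 16, 20, 2339, 23, 1, 0],      # Bangla
    [8, 103, 26, 11, 9, 22, 4040, 5, 0],        # Hindi
    [30, 928, 66, 311, 23, 3, 3, 2651, 0],      # Symbols
    [0, 0, 0, 0, 0, 0, 0, 0, 0],                # None
]


def per_class_recall(matrix, labels):
    """Return {label: diagonal / row sum}, or None for rows with no samples."""
    recall = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        recall[label] = matrix[i][i] / row_total if row_total else None
    return recall


def overall_accuracy(matrix):
    """Return the fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


recall = per_class_recall(MATRIX, LABELS)
accuracy = overall_accuracy(MATRIX)

for label in LABELS:
    value = recall[label]
    shown = "n/a" if value is None else f"{value:.4f}"
    print(f"{label:<9} recall: {shown}")
print(f"Overall accuracy: {accuracy:.4f}")
```

Reading along a row shows where a script's errors go: for example, most misclassified Japanese samples are detected as Latin (1494) or Chinese (1233).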