- Task 2 - Script identification - Method: An approach towards Word-Level Multi-Script Identification using Deep Transfer Features and SVM
Method: An approach towards Word-Level Multi-Script Identification using Deep Transfer Features and SVM
Date: 2017-07-01
Authors: Arindam Das, Saikat Roy
Description: A pre-trained VGG16 model is used, with its weights transferred to the problem of script identification. Each labeled image is resized to 224x224 and passed through this deep CNN as a 3D matrix to extract features; beforehand, the images in each set are normalized using the mean and standard deviation of the training set. The CNN was not trained further. Instead, 4096-dimensional feature vectors are extracted from the last fully connected layer by forward propagation (for each dataset). An SVM with an RBF kernel is trained on the training set as the classifier. An accuracy of 85.03% was achieved on the validation set, and the same hyper-parameters were used to predict the scripts in the test set.
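
The pipeline described above can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the description does not name a framework, so PyTorch, torchvision and scikit-learn are assumptions, as are the function names, the order of resizing and normalization, the choice to take the 4096-dimensional activation after the second fully connected layer, and the `C` and `gamma` defaults (the actual SVM hyper-parameters are not given).

```python
"""Sketch: frozen VGG16 features + RBF-kernel SVM.

Assumes torch, torchvision (>= 0.13) and scikit-learn are installed.
All names and hyper-parameter values here are illustrative assumptions.
"""


def build_extractor():
    """Return an ImageNet-pretrained VGG16 that outputs 4096-d vectors."""
    import torch.nn as nn
    from torchvision import models

    vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    # Drop the final 1000-way layer so the network ends at the
    # 4096-unit fully connected layer.
    vgg.classifier = nn.Sequential(*list(vgg.classifier.children())[:-1])
    # eval() disables dropout; the weights are never updated.
    return vgg.eval()


def extract_features(extractor, images, mean, std):
    """Resize, normalize and forward-propagate a batch of images.

    images: float tensor of shape (N, 3, H, W)
    mean, std: per-channel training-set statistics, shape (1, 3, 1, 1)
    Returns a NumPy array of shape (N, 4096).
    """
    import torch
    import torch.nn.functional as F

    with torch.no_grad():
        x = F.interpolate(images, size=(224, 224), mode="bilinear",
                          align_corners=False)
        x = (x - mean) / std
        return extractor(x).numpy()


def train_and_predict(train_feats, train_labels, test_feats,
                      C=1.0, gamma="scale"):
    """Fit an RBF-kernel SVM on training features and label test features."""
    from sklearn.svm import SVC

    clf = SVC(kernel="rbf", C=C, gamma=gamma)
    clf.fit(train_feats, train_labels)
    return clf.predict(test_feats)
```

In this sketch the same `mean` and `std`, computed once on the training images, would be passed to `extract_features` for the training, validation and test sets, and `C` and `gamma` would be chosen on the validation set before predicting the test set.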
Confusion Matrix

Rows are ground truth (GT); columns are detections.

| GT | Arabic | Latin | Chinese | Japanese | Korean | Bangla | Symbols | Mixed | None |
|---|---|---|---|---|---|---|---|---|---|
| Arabic | 3024 | 1914 | 28 | 51 | 69 | 50 | 6 | 0 | 0 |
| Latin | 224 | 59197 | 254 | 344 | 284 | 128 | 106 | 0 | 0 |
| Chinese | 69 | 1585 | 2273 | 652 | 150 | 19 | 2 | 0 | 0 |
| Japanese | 129 | 5328 | 548 | 1861 | 257 | 25 | 9 | 0 | 0 |
| Korean | 204 | 6681 | 496 | 993 | 4495 | 118 | 5 | 0 | 0 |
| Bangla | 42 | 859 | 35 | 63 | 37 | 1508 | 1 | 0 | 0 |
| Symbols | 14 | 2797 | 1 | 9 | 4 | 4 | 667 | 0 | 0 |
| Mixed | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| None | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
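
To make the confusion matrix easier to read, the short sketch below derives overall accuracy and per-class recall from the counts, taking rows as ground truth and columns as detections. The function names are illustrative, and these derived figures are not necessarily the metric the competition reports.

```python
LABELS = ["Arabic", "Latin", "Chinese", "Japanese", "Korean",
          "Bangla", "Symbols", "Mixed", "None"]

# Rows are ground truth, columns are detections, both in LABELS order.
CONFUSION = [
    [3024, 1914, 28, 51, 69, 50, 6, 0, 0],
    [224, 59197, 254, 344, 284, 128, 106, 0, 0],
    [69, 1585, 2273, 652, 150, 19, 2, 0, 0],
    [129, 5328, 548, 1861, 257, 25, 9, 0, 0],
    [204, 6681, 496, 993, 4495, 118, 5, 0, 0],
    [42, 859, 35, 63, 37, 1508, 1, 0, 0],
    [14, 2797, 1, 9, 4, 4, 667, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]


def overall_accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix, labels):
    """Recall per ground-truth class; None for classes with no samples."""
    recalls = {}
    for i, label in enumerate(labels):
        support = sum(matrix[i])
        recalls[label] = matrix[i][i] / support if support else None
    return recalls


if __name__ == "__main__":
    print(f"overall accuracy: {overall_accuracy(CONFUSION):.4f}")
    for label, recall in per_class_recall(CONFUSION, LABELS).items():
        shown = "n/a" if recall is None else f"{recall:.4f}"
        print(f"{label:8s} recall: {shown}")
```

From the table, 73,025 of the 97,619 samples lie on the diagonal, about 74.8%. For every non-Latin class, the largest off-diagonal count is in the Latin column.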