Method: 4Paradigm-Data-Intelligence - Task 2 - Script identification - ICDAR2017 Competition on Multi-lingual scene text detection and script identification

Method: 4Paradigm-Data-Intelligence (2019-05-30)

Authors: ACVG

Description: The recognition model is Transformer-based with a ResNet50 backbone. A voting process over the recognized transcript identifies its language. Training sets: the 2017 MLT Task 2 training set, the 2019 MLT Task 2 training set, and the 2019 MLT synthetic dataset.
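
The description does not say how the vote is carried out. One plausible reading is a per-character majority vote over the recognized transcript, sketched below. This is an illustration under stated assumptions, not the submitted method: the prefix table `PREFIX_TO_SCRIPT` and the functions `char_script` and `vote_script` are hypothetical names, and mapping every unlisted character (digits, punctuation) to "Symbols" is a simplification.

```python
import unicodedata
from collections import Counter

# Hypothetical mapping from the first word of a Unicode character name to the
# script labels used in the confusion matrix; not taken from the submission.
PREFIX_TO_SCRIPT = {
    "ARABIC": "Arabic",
    "LATIN": "Latin",
    "CJK": "Chinese",
    "HIRAGANA": "Japanese",
    "KATAKANA": "Japanese",
    "HANGUL": "Korean",
    "BENGALI": "Bangla",
}


def char_script(ch):
    """Return the assumed script label for a single character."""
    name = unicodedata.name(ch, "")      # "" for unnamed code points
    prefix = name.split(" ", 1)[0]       # e.g. "LATIN", "HANGUL", "CJK"
    return PREFIX_TO_SCRIPT.get(prefix, "Symbols")


def vote_script(transcript):
    """Majority vote over the non-whitespace characters of a transcript."""
    votes = Counter(char_script(ch) for ch in transcript if not ch.isspace())
    if not votes:
        return None                      # nothing to vote on
    return votes.most_common(1)[0][0]


print(vote_script("Hello, world"))
```

A vote of this kind cannot separate Chinese from Japanese text written only in kanji, since both fall under the same CJK code points.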

Confusion Matrix

		Detection
		Arabic	Latin	Chinese	Japanese	Korean	Bangla	Symbols	Mixed	None
GT	Arabic	4980	114	5	11	12	9	11	0	0
	Latin	423	59021	99	197	410	143	244	0	0
	Chinese	26	309	3481	632	253	19	30	0	0
	Japanese	192	1508	1242	4438	613	82	82	0	0
	Korean	110	1235	257	345	10804	141	100	0	0
	Bangla	10	84	2	7	56	2383	3	0	0
	Symbols	126	406	6	6	81	5	2866	0	0
	Mixed	0	0	0	0	0	0	0	0	0
	None	0	0	0	0	0	0	0	0	0
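
Summary figures can be derived from the matrix above. The sketch below copies the seven non-empty classes (the Mixed and None rows and columns are all zero and are omitted), with rows as ground truth and columns as the detected script. It computes overall accuracy plus per-class recall and precision. The names `LABELS`, `MATRIX`, and `summarize` are illustrative only.

```python
LABELS = ["Arabic", "Latin", "Chinese", "Japanese", "Korean", "Bangla", "Symbols"]

# Rows: ground truth. Columns: detected script. Values copied from the table.
MATRIX = [
    [4980, 114, 5, 11, 12, 9, 11],
    [423, 59021, 99, 197, 410, 143, 244],
    [26, 309, 3481, 632, 253, 19, 30],
    [192, 1508, 1242, 4438, 613, 82, 82],
    [110, 1235, 257, 345, 10804, 141, 100],
    [10, 84, 2, 7, 56, 2383, 3],
    [126, 406, 6, 6, 81, 5, 2866],
]


def summarize(matrix):
    """Return (total, correct, per-class recall, per-class precision)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    # Recall: diagonal entry divided by its row (ground-truth) sum.
    recall = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    # Precision: diagonal entry divided by its column (detected) sum.
    col_sums = [sum(row[j] for row in matrix) for j in range(n)]
    precision = [matrix[j][j] / col_sums[j] for j in range(n)]
    return total, correct, recall, precision


total, correct, recall, precision = summarize(MATRIX)
print(f"accuracy: {correct}/{total} = {correct / total:.4f}")
for label, r, p in zip(LABELS, recall, precision):
    print(f"{label:<9} recall={r:.4f} precision={p:.4f}")
```

The diagonal sums to 87,973 of 97,619 samples, an overall accuracy of about 90.1%. Japanese has the lowest recall (4438 of 8157, about 54.4%), with most of its errors going to Latin and Chinese.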