Method: CNN based method 2 - Task 2 - Script identification - ICDAR2017 Competition on Multi-lingual scene text detection and script identification

method: CNN based method 2 (2017-06-30)

Authors: Yash Patel, Michal Bušta, Lukáš Neumann, Jiri Matas

Description: Our method uses a CNN-based approach for script identification in cropped word images. We use the convolutional layers of the VGG-16 architecture, followed by a global average pooling layer and two fully connected layers. The objective of our method is to preserve the aspect ratio of the input images. Thus, for both training and testing, we resize each image to a tensor of fixed height (64) and variable width ((image width * 64) / image height). For training, we initialize the convolutional layers with ImageNet weights. We use the categorical cross-entropy loss function and update all layers (both convolutional and fully connected) during back-propagation.
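The resizing rule and the role of global average pooling can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names `target_size` and `global_average_pool` are hypothetical, and rounding the width to the nearest integer (with a minimum of 1) is an assumption, since the description only gives the ratio.

```python
def target_size(width, height, fixed_height=64):
    """Return (new_width, new_height) for an aspect-ratio-preserving resize."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Width formula from the description: (image width * 64) / image height.
    # Rounding and the minimum width of 1 are assumptions of this sketch.
    new_width = max(1, round(width * fixed_height / height))
    return new_width, fixed_height


def global_average_pool(feature_map):
    """Average a [channels][rows][cols] nested list over rows and cols.

    The output has one value per channel, whatever the spatial size,
    which is why variable-width inputs can feed fixed-size dense layers.
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


if __name__ == "__main__":
    print(target_size(200, 100))  # a 200x100 crop becomes 128 wide, 64 high
    narrow = [[[1, 2, 3]], [[4, 5, 6]]]          # 2 channels, width 3
    wide = [[[1, 2, 3, 4, 5]], [[2, 2, 2, 2, 2]]]  # 2 channels, width 5
    print(len(global_average_pool(narrow)), len(global_average_pool(wide)))
```

Both feature maps in the usage example pool to a vector of length 2, even though their widths differ, so the two fully connected layers always see an input of the same size.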

Confusion Matrix (rows: ground truth, GT; columns: detected script)

		Detection
		Arabic	Latin	Chinese	Japanese	Korean	Bangla	Symbols	Mixed	None
GT	Arabic	4683	341	25	50	16	12	15	0	0
	Latin	185	58064	549	1021	423	124	171	0	0
	Chinese	7	208	3733	727	62	7	6	0	0
	Japanese	44	1772	1334	4640	313	31	23	0	0
	Korean	39	2324	846	923	8756	92	12	0	0
	Bangla	5	206	32	35	19	2246	2	0	0
	Symbols	46	914	21	67	19	9	2420	0	0
	Mixed	0	0	0	0	0	0	0	0	0
	None	0	0	0	0	0	0	0	0	0
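
As a reading aid, the following sketch derives summary metrics from the confusion matrix above. The counts are copied from the table; the all-zero Mixed and None rows and columns are left out, and the helper names are hypothetical rather than part of the competition tooling.

```python
SCRIPTS = ["Arabic", "Latin", "Chinese", "Japanese", "Korean", "Bangla", "Symbols"]

# Rows are ground truth, columns are detections, in the order of SCRIPTS.
# The Mixed and None rows/columns are all zero in the table and are omitted.
CONFUSION = [
    [4683, 341, 25, 50, 16, 12, 15],
    [185, 58064, 549, 1021, 423, 124, 171],
    [7, 208, 3733, 727, 62, 7, 6],
    [44, 1772, 1334, 4640, 313, 31, 23],
    [39, 2324, 846, 923, 8756, 92, 12],
    [5, 206, 32, 35, 19, 2246, 2],
    [46, 914, 21, 67, 19, 9, 2420],
]


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly identified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix, labels):
    """Diagonal count divided by the row sum (ground-truth total)."""
    return {label: matrix[i][i] / sum(matrix[i]) for i, label in enumerate(labels)}


def per_class_precision(matrix, labels):
    """Diagonal count divided by the column sum (detection total)."""
    return {
        label: matrix[i][i] / sum(row[i] for row in matrix)
        for i, label in enumerate(labels)
    }


if __name__ == "__main__":
    print(f"accuracy: {overall_accuracy(CONFUSION):.4f}")
    for label, value in per_class_recall(CONFUSION, SCRIPTS).items():
        print(f"recall {label}: {value:.4f}")
```

Computed this way, the table covers 97619 word images, of which 84542 lie on the diagonal, an overall accuracy of about 86.6%. Japanese has the lowest recall (about 56.9%), with most of its errors going to Latin and Chinese.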