Method: TH-CNN - Task 2 - Script identification - ICDAR2017 Competition on Multi-lingual scene text detection and script identification

method: TH-CNN (2017-07-01)

Authors: Yejun Tang, Haoyu Qin, Liangrui Peng, Department of Electronic Engineering, Tsinghua University, Beijing, China

Description: A simplified GoogLeNet (Caffe implementation) is used. The network is trained on augmented samples: the original training images are rotated, blurred, mirrored, and inverted. The number of training samples is balanced across scripts. Input images are resized to 256x256 pixels and cropped to 227x227 pixels.
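The augmentation and preprocessing steps described above can be sketched in plain Python. This is only an illustration, not the authors' Caffe pipeline: the description does not give the rotation angle, the blur kernel, the crop position, or the balancing strategy, so the 90-degree rotation, 3x3 mean filter, center crop, nearest-neighbour resize, and oversampling used here are all assumptions, as are the function names. Images are represented as 2D lists of grayscale values for simplicity.

```python
import random

def mirror(img):
    """Flip a 2D image (list of rows) left to right."""
    return [row[::-1] for row in img]

def invert(img, max_val=255):
    """Invert pixel intensities."""
    return [[max_val - p for p in row] for row in img]

def rotate90(img):
    """Rotate 90 degrees clockwise (the angle is an assumption)."""
    return [list(row) for row in zip(*img[::-1])]

def box_blur(img):
    """3x3 mean filter (assumed kernel); borders use the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out

def resize_nearest(img, size=256):
    """Resize to size x size with nearest-neighbour sampling (assumed method)."""
    h, w = len(img), len(img[0])
    return [[img[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def center_crop(img, size=227):
    """Take a size x size center crop (crop position is an assumption)."""
    h, w = len(img), len(img[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]

def augment(img):
    """Return the original plus one rotated, blurred, mirrored, and inverted copy."""
    return [img, rotate90(img), box_blur(img), mirror(img), invert(img)]

def preprocess(img):
    """Resize to 256x256, then crop to 227x227, as in the description."""
    return center_crop(resize_nearest(img, 256), 227)

def balance(samples_by_script, seed=0):
    """Oversample each script up to the largest class size (assumed strategy)."""
    rng = random.Random(seed)
    target = max(len(items) for items in samples_by_script.values())
    return {script: items + [rng.choice(items) for _ in range(target - len(items))]
            for script, items in samples_by_script.items()}
```

A typical use would call `augment` on each training image, `balance` on the per-script sample lists, and `preprocess` on every image before it is fed to the network.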

Confusion Matrix

		Detection
		Arabic	Latin	Chinese	Japanese	Korean	Bangla	Symbols	Mixed	None
GT	Arabic	346	3299	194	482	462	257	102	0	0
	Latin	4394	39592	1856	5384	5097	2821	1393	0	0
	Chinese	400	3147	139	397	361	193	113	0	0
	Japanese	722	5219	242	749	682	379	164	0	0
	Korean	1034	8507	388	1148	1194	406	315	0	0
	Bangla	159	1759	88	195	199	83	62	0	0
	Symbols	264	2316	93	285	291	158	89	0	0
	Mixed	0	0	0	0	0	0	0	0	0
	None	0	0	0	0	0	0	0	0	0
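Summary figures can be derived from the matrix above. The following is a minimal sketch, assuming rows are ground-truth scripts (GT) and columns are predicted scripts (Detection), as the headers indicate. The all-zero Mixed and None rows and columns are left out, and the function names are illustrative.

```python
LABELS = ["Arabic", "Latin", "Chinese", "Japanese", "Korean", "Bangla", "Symbols"]

# Rows: ground truth; columns: predicted script (values copied from the table).
MATRIX = [
    [346, 3299, 194, 482, 462, 257, 102],
    [4394, 39592, 1856, 5384, 5097, 2821, 1393],
    [400, 3147, 139, 397, 361, 193, 113],
    [722, 5219, 242, 749, 682, 379, 164],
    [1034, 8507, 388, 1148, 1194, 406, 315],
    [159, 1759, 88, 195, 199, 83, 62],
    [264, 2316, 93, 285, 291, 158, 89],
]

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly identified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_recall(matrix, labels):
    """Diagonal entry divided by its row sum (ground-truth count)."""
    return {labels[i]: matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))}

def per_class_precision(matrix, labels):
    """Diagonal entry divided by its column sum (predicted count)."""
    return {labels[j]: matrix[j][j] / sum(row[j] for row in matrix)
            for j in range(len(matrix))}

if __name__ == "__main__":
    print(f"overall accuracy: {overall_accuracy(MATRIX):.4f}")
    recall = per_class_recall(MATRIX, LABELS)
    precision = per_class_precision(MATRIX, LABELS)
    for name in LABELS:
        print(f"{name:8s} recall={recall[name]:.4f} precision={precision[name]:.4f}")
```

Recall here answers "of the words truly in this script, how many were identified correctly", while precision answers "of the words assigned to this script, how many truly belong to it".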