method: NXB OCR (2019-06-03)

Authors: Yupeng Cao (X), Jie Zhang (B), Qiufeng Wang* (X), Jing Li (X), Qi Qu (B), Cheng Cheng* (N), Kaizhu Huang* (X) (Equal Contribution)

Description: A text detector based on semantic segmentation is used. It combines EAST [1] and PSENet [2], and model ensembling is used to increase accuracy. For script identification, a CNN-based classifier is trained on cropped word images. Only the ICDAR 2017 MLT training set and the ICDAR 2019 training set are used.
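The description does not say how the EAST and PSENet outputs are ensembled. As an illustration only, one common box-level approach is to pool the detections of both models and apply greedy non-maximum suppression (NMS). The sketch below assumes axis-aligned boxes with a confidence score; the names `ensemble_detections`, `east_boxes`, `psenet_boxes` and the IoU threshold of 0.5 are hypothetical and not taken from the submission.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max, score).
# Axis-aligned boxes are an assumption made to keep the sketch short;
# scene text detectors usually output rotated boxes or polygons.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ensemble_detections(dets_a: List[Box], dets_b: List[Box],
                        iou_thresh: float = 0.5) -> List[Box]:
    """Pool boxes from two detectors, then keep the highest-scoring
    box among any group that overlaps by more than iou_thresh."""
    pooled = sorted(dets_a + dets_b, key=lambda box: box[4], reverse=True)
    kept: List[Box] = []
    for box in pooled:
        if all(iou(box, other) <= iou_thresh for other in kept):
            kept.append(box)
    return kept


# Made-up detections for illustration, not results from the submission.
east_boxes = [(10, 10, 50, 30, 0.90), (100, 40, 160, 60, 0.60)]
psenet_boxes = [(12, 11, 52, 31, 0.80), (200, 80, 260, 100, 0.70)]
merged = ensemble_detections(east_boxes, psenet_boxes)
```

Here the two boxes near (10, 10) overlap heavily, so only the higher-scoring one is kept, while the boxes found by a single model pass through unchanged. Other ensembling schemes, such as averaging the per-pixel score maps before post-processing, are equally plausible.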

P.S. Affiliation of Authors
(X: Xi’an Jiaotong-Liverpool University;
N: Institute of Nanotechnology and Nano-Bionics, Chinese Academy of Sciences;
B: Beijing Babel Technology Co., Ltd.)

[1] X. Zhou, C. Yao, H. Wen, Y. Wang, S. Zhou, W. He, and J. Liang. EAST: an efficient and accurate scene text detector. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5551–5560, 2017.
[2] X. Li, W. Wang, W. Hou, R.-Z. Liu, T. Lu, and J. Yang. Shape robust text detection with progressive scale expansion network. arXiv preprint arXiv:1806.02559, 2018.