Authors: Cheng Cheng*(N), Jie Zhang(B), Qi Qu(B), Qiufeng Wang*(X), Jing Li(X), YuPeng Cao(X), Kaizhu Huang*(X) (*Equal contribution)
Description: A text detector based on semantic segmentation is used; the method is mainly inspired by fully convolutional networks (FCNs). First, a CNN is adopted to detect text blocks, from which character candidates are extracted. Then, an FCN is used to predict the corresponding segmentation masks. Last, the segmentation masks are used to find suitable rectangular bounding boxes for the text instances. Model ensembling is used to increase accuracy. Only the ICDAR 2017 MLT training set and the ICDAR 2019 training set are used for training.
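The last two steps of the description (ensembling, then turning a segmentation mask into rectangular boxes) can be illustrated with a minimal sketch. This is not the authors' implementation: the simple averaging of probability maps, the 0.5 threshold, the 4-connected grouping, the axis-aligned boxes, and the function names ensemble_mean and mask_to_boxes are all assumptions made for illustration.

```python
from collections import deque


def ensemble_mean(prob_maps):
    """Average several per-pixel text-probability maps of identical shape.

    Plain averaging is an assumed ensembling rule, not the submission's.
    """
    n = len(prob_maps)
    rows, cols = len(prob_maps[0]), len(prob_maps[0][0])
    return [[sum(m[r][c] for m in prob_maps) / n for c in range(cols)]
            for r in range(rows)]


def mask_to_boxes(prob, threshold=0.5):
    """Return one (x_min, y_min, x_max, y_max) box per 4-connected region
    whose pixels have probability >= threshold (an assumed cut-off)."""
    rows, cols = len(prob), len(prob[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or prob[r][c] < threshold:
                continue
            # Breadth-first flood fill over one text region,
            # tracking the extreme coordinates as we go.
            queue = deque([(r, c)])
            seen[r][c] = True
            x0 = x1 = c
            y0 = y1 = r
            while queue:
                y, x = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and prob[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return boxes


# Two toy 4x4 probability maps standing in for two models' outputs.
map_a = [[0.9, 0.8, 0.1, 0.0],
         [0.7, 0.9, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.6],
         [0.0, 0.0, 0.0, 0.8]]
map_b = [[0.7, 0.6, 0.3, 0.0],
         [0.9, 0.7, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.8],
         [0.0, 0.0, 0.0, 0.6]]

boxes = mask_to_boxes(ensemble_mean([map_a, map_b]))
print(boxes)
```

On the toy maps above, the two text regions yield two boxes in raster-scan order. A real pipeline would more likely fit rotated rectangles to each region, since scene text is often not axis-aligned.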
P.S. Affiliation of Authors
(N: Institute of Nanotechnology and Nano-Bionics, Chinese Academy of Sciences;
X: Xi’an Jiaotong-Liverpool University;
B: Beijing Babel Technology Co., Ltd.)
X. Zhou, C. Yao, H. Wen, Y. Wang, S. Zhou, W. He, and J. Liang. EAST: An efficient and accurate scene text detector. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5551–5560, 2017.
X. Li, W. Wang, W. Hou, R.-Z. Liu, T. Lu, and J. Yang. Shape robust text detection with progressive scale expansion network. arXiv preprint arXiv:1806.02559, 2018.