method: Task1-re5 2019-04-30

Authors: Yumei Li, Jianwei Wu, Wenhao He (angelicohe@tencent.com), Tao Xue, Long Liu

Description: We combine the results of CNN and RNN models. First, we recognize characters by sliding character models over the text line image; the models are learned end to end on text line images labeled with text transcripts. The character classifier outputs on the sliding windows are normalized and decoded with a Connectionist Temporal Classification (CTC) based algorithm. Second, we use a neural network model based on Convolutional Neural Networks, Recurrent Neural Networks, and a novel attention mechanism to obtain its own recognition results. Finally, we apply dictionary-based post-processing and vote for the final results. In addition to the training set provided by ReCTS 2019, we used public datasets, including MLT, ICDAR, CASIA10K, and COCO-Text, as well as synthetic datasets.
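
The decoding and voting steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the submitted system: the description says "CTC based algorithm" without naming a variant, so greedy (best-path) decoding is assumed here, and the voting rule (majority vote, ties broken by dictionary membership) is likewise an assumption. The names `ctc_greedy_decode`, `vote`, `BLANK`, and `lexicon` are hypothetical.

```python
from collections import Counter

BLANK = 0  # assumed index of the CTC blank label


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding over per-window classifier scores.

    Takes the top-scoring label for each sliding window, merges
    consecutive repeats, and drops blanks.
    """
    decoded = []
    previous = blank
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a character only when it is not blank and not a repeat
        # of the label chosen for the previous window.
        if best != blank and best != previous:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)


def vote(candidates, lexicon=()):
    """Pick a final transcript from several model outputs.

    Assumed rule: most frequent candidate wins; ties go to a candidate
    found in the lexicon, then to the earliest candidate.
    """
    counts = Counter(candidates)
    return max(candidates, key=lambda text: (counts[text], text in lexicon))


# Index 0 of the alphabet is a placeholder for the blank label.
alphabet = "-ab"
frame_scores = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.1, 0.7],    # b
]
sliding_result = ctc_greedy_decode(frame_scores, alphabet)
final_result = vote([sliding_result, "ab", "aab"], lexicon={"aab"})
```

The blank between the second and fourth windows is what lets the decoder keep two separate "a" characters instead of merging them into one.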

Organization: Tencent Map Big Data Lab Image Recognition Team

References

[1] Fei Yin, Yi-Chao Wu, Xu-Yao Zhang, Cheng-Lin Liu. Scene Text Recognition with Sliding Convolutional Character Models. arXiv preprint arXiv:1709.01727, 2017. http://arxiv.org/abs/1709.01727
[2] Wojna Z, Gorban A N, Lee D S, et al. Attention-based extraction of structured information from street view imagery[C]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2017, 1: 844-850.
[3] He W, Zhang X Y, Yin F, et al. Multi-oriented and multi-lingual scene text detection with direct regression[J]. IEEE Transactions on Image Processing, 2018, 27(11): 5406-5419.
[4] Nayef N, Yin F, Bizid I, et al. ICDAR2017 robust reading challenge on multi-lingual scene text detection and script identification - RRC-MLT[C]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2017, 1: 1454-1459.
[5] Shi B, Yao C, Liao M, et al. ICDAR2017 competition on reading Chinese text in the wild (RCTW-17)[C]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2017, 1: 1429-1434.
[6] Gomez R, Shi B, Gomez L, et al. ICDAR2017 robust reading challenge on COCO-Text[C]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2017, 1: 1435-1443.