method: HIK_OCR (2017-07-01)

Authors: Zhanzhan Cheng*, Gang Zheng*, Fan Bai, Yunlu Xu, Jie Wang, Ying Yao, Zhaoxuan Fan, Zhiqian Zhang, Yi Niu (* equal contribution)

Description: The method is built on a sequence-to-sequence framework. In the encoder, images are resized to 100x100 and features are extracted by a CNN; in the decoder, the character sequence is generated by an attention-based decoder. The novelties of the method are: 1) a sophisticated CNN-based model for feature extraction, with several special mechanisms, including a mask spatial transform, for handling arbitrarily placed text; 2) an Edit Probability Loss, developed for training in place of the standard softmax loss; 3) a self-adaptive gating mechanism to capture global information.
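To illustrate the decoder side of the pipeline, the following is a minimal NumPy sketch of one attention-based decoding step: given a sequence of CNN feature vectors, the decoder scores each feature position against its hidden state, forms a context vector as the attention-weighted sum, and emits a distribution over characters. All dimensions, parameter matrices, and the additive scoring function here are hypothetical placeholders, not the authors' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: T feature positions from the CNN encoder,
# D feature dim, H decoder hidden dim, V character vocabulary size.
T, D, H, V = 25, 64, 128, 37

# Hypothetical (randomly initialized) parameters for scoring and output.
W_att = rng.normal(scale=0.1, size=(H + D, 1))
W_out = rng.normal(scale=0.1, size=(H + D, V))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_step(features, hidden):
    """One decoding step: attend over CNN features, emit a char distribution.

    features: (T, D) encoder feature sequence
    hidden:   (H,)   current decoder hidden state
    """
    # Score each of the T feature vectors against the decoder state.
    pairs = np.concatenate([np.tile(hidden, (features.shape[0], 1)), features],
                           axis=1)                  # (T, H + D)
    alpha = softmax((pairs @ W_att).ravel())        # attention weights over T
    context = alpha @ features                      # weighted sum -> (D,)
    logits = np.concatenate([hidden, context]) @ W_out
    return softmax(logits), alpha                   # char probs (V,), weights (T,)

features = rng.normal(size=(T, D))   # stand-in for the CNN feature sequence
hidden = rng.normal(size=(H,))
probs, alpha = attention_step(features, hidden)
```

At inference, this step would run in a loop, feeding each predicted character back into the decoder state until an end-of-sequence symbol is produced; the Edit Probability Loss described above would replace the per-step softmax cross-entropy during training.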