method: HIK_OCR (2017-07-01)

Authors: Zhanzhan Cheng*, Gang Zheng*, Fan Bai, Yunlu Xu, Jie Wang, Yangliu Xu, Ying Yao, Fan Wu, Yi Niu (*equal contribution)

Description: The method is based on the sequence-to-sequence framework:
Encoder: Images are resized to 100 x 100 pixels, and features are extracted with a complex CNN.
Decoder: The character sequence is generated by an attention-based decoder.
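
As a rough illustration of what one step of an attention-based decoder does, the sketch below weights a set of encoder feature vectors by their similarity to a decoder state and returns the weighted sum (the context vector). The dot-product scoring and the names attention_step, features, and query are assumptions made for illustration; the description does not specify how HIK_OCR computes its attention.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_step(features, query):
    """One attention step (dot-product scoring is an assumption).

    features: list of encoder feature vectors (lists of floats)
    query:    current decoder state (list of floats, same length)
    Returns the attention weights and the context vector.
    """
    # Score each feature vector against the decoder state.
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in features]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the feature vectors.
    dim = len(features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return weights, context


# Hypothetical toy input: two 2-D feature vectors and one decoder state.
weights, context = attention_step([[1.0, 0.0], [0.0, 1.0]], [10.0, 0.0])
```

In a full decoder, the context vector would be combined with the previous output character to predict the next one, and the step would repeat until an end-of-sequence symbol is produced.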

Innovations:
1) We design a complex CNN-based feature extraction mechanism (mask spatial transform, etc.) to capture features of arbitrarily placed text;
2) To handle the problem of character insertions and deletions, we develop an Edit Probability Loss that replaces SoftmaxWithLoss in the sequence learning task;
3) We also design a self-adaptive gate mechanism for the CNN so that the network can capture global information.
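
To see why extra or missing characters are a problem for a position-by-position loss, the sketch below compares two ways of scoring a prediction against the ground truth: counting mismatches at each position, and the edit (Levenshtein) distance. This only illustrates the motivation for the second item; it is not the Edit Probability Loss itself, whose formulation is not given here. The function names are hypothetical.

```python
def positional_mismatches(pred, target):
    """Count positions where the two strings differ, padding the shorter one."""
    n = max(len(pred), len(target))
    padded_pred = pred.ljust(n, "\0")
    padded_target = target.ljust(n, "\0")
    return sum(p != t for p, t in zip(padded_pred, padded_target))


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# One duplicated character shifts every later position out of alignment.
print(positional_mismatches("hhello", "hello"))  # 4
print(edit_distance("hhello", "hello"))          # 1
```

A single extra character is one edit away from the correct answer, yet it is counted as four errors when the strings are compared position by position, which is the kind of misalignment the item above describes.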

The papers are in preparation.