Robust Reading Competition
Challenges

method: Tencent-OCR+ (2017-06-30)

Authors: Chunchao Guo, Weichen Zhang, Yi Li, Hui Song, Ming Liu, Hongfa Wang, Lei Xiao

Description: Data Platform Department, Tencent. We adapt a CNN-LSTM-CTC architecture to recognize text lines. In addition, knowledge-based post-processing is used to adjust the result.
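
As a rough illustration of the CTC stage in a CNN-LSTM-CTC pipeline, the sketch below shows greedy (best-path) CTC decoding: take the highest-scoring class in each frame, merge consecutive repeats, then drop blanks. This is not the Tencent implementation. The function name `ctc_greedy_decode`, the blank index 0, and the toy alphabet are assumptions made for the example.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores[t][k] is the score of class k at time step t.
    Class 0 is the blank; class k > 0 maps to alphabet[k - 1].
    """
    # Highest-scoring class index for every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]

    chars: List[str] = []
    previous = BLANK
    for idx in best_path:
        # Emit a character only when it is not blank and differs from the previous frame.
        if idx != BLANK and idx != previous:
            chars.append(alphabet[idx - 1])
        previous = idx
    return "".join(chars)


if __name__ == "__main__":
    # Toy per-frame scores over the classes (blank, 'a', 'b', 'c').
    frames = [
        [0.1, 0.8, 0.1, 0.0],  # a
        [0.1, 0.7, 0.1, 0.1],  # a (repeat, merged)
        [0.9, 0.1, 0.0, 0.0],  # blank (separates the two a's)
        [0.0, 0.9, 0.1, 0.0],  # a
        [0.0, 0.1, 0.9, 0.0],  # b
    ]
    print(ctc_greedy_decode(frames, "abc"))
```

The blank between the second and third frames is what lets a doubled letter survive the merge step. A production system would more likely use beam search, possibly combined with a lexicon or language model.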

method: HIK_OCR (2017-07-01)

Authors: Zhanzhan Cheng*, Gang Zheng*, Fan Bai, Yunlu Xu, Jie Wang, Yangliu Xu, Ying Yao, Fan Wu, Yi Niu (*equal contribution)

Description: The method is based on the sequence-to-sequence framework:
Encoder: Images are resized to 100 x 100 pixels, and features are extracted using a complex CNN;
Decoder: The character sequence is generated with an attention-based decoder.
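
To make the decoder idea concrete, here is a minimal sketch of a single attention step: the decoder state scores every encoder feature vector, the scores are normalized with a softmax, and the weighted sum becomes the context used to predict the next character. The description above does not say which attention variant HIK_OCR uses, so the dot-product scoring and the name `attention_step` are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def attention_step(
    query: Sequence[float], features: Sequence[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """One attention step over encoder features.

    query    : current decoder state (length d)
    features : encoder feature vectors, one per image position (each length d)
    Returns (attention weights, context vector).
    """
    # Dot-product relevance score for each encoder position (assumed scoring function).
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]

    # Numerically stable softmax over the positions.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the encoder features.
    dim = len(features[0])
    context = [sum(w * feat[d] for w, feat in zip(weights, features)) for d in range(dim)]
    return weights, context


if __name__ == "__main__":
    weights, context = attention_step([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]])
    print(weights, context)
```

In a full decoder this step runs once per output character, with the query updated by the recurrent state after each prediction.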

Innovations:
1) We design a complex CNN-based feature extraction mechanism (mask spatial transform, etc.) to capture features of arbitrarily placed text;
2) To handle character insertions and deletions, we develop an Edit Probability Loss to replace SoftmaxWithLoss in the sequence learning task;
3) We also design a self-adaptive gate mechanism for the CNN so that the network can capture global information.

The papers are in preparation.

method: baseline (2017-06-29)

Authors: Zhang, yi Pei

Description: The result is based on the HOG+LDA algorithm.

Ranking Table

T.E.D. = Total Edit Distance; C.R.W. = Correctly Recognised Words

Date        Method        T.E.D. (case sensitive)  C.R.W. (case sensitive)  T.E.D. (case insensitive)  C.R.W. (case insensitive)
2017-06-30  Tencent-OCR+  158.34                   89.83%                   121.77                     91.24%
2017-07-01  HIK_OCR       198.65                   88.54%                   179.42                     89.17%
2017-06-29  baseline      1,749.87                 34.51%                   1,501.23                   42.91%
2017-06-26  textminer     2,399.82                 24.80%                   1,577.14                   50.08%
2017-06-30  onceAgain     2,744.68                 13.68%                   2,038.88                   29.53%
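
For readers unfamiliar with the two metrics, the sketch below computes a total edit distance and a percentage of correctly recognised words over (ground truth, prediction) pairs, in both case-sensitive and case-insensitive modes. It uses plain Levenshtein distance. The fractional totals in the table suggest the competition applies some normalization that this page does not specify, so the numbers produced here are not directly comparable. The function names `levenshtein` and `score` are hypothetical.

```python
from typing import Iterable, Tuple


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def score(pairs: Iterable[Tuple[str, str]], case_sensitive: bool = True) -> Tuple[int, float]:
    """Return (total edit distance, percentage of exactly matching words)."""
    total = correct = count = 0
    for truth, predicted in pairs:
        if not case_sensitive:
            truth, predicted = truth.lower(), predicted.lower()
        total += levenshtein(truth, predicted)
        correct += int(truth == predicted)
        count += 1
    percent = 100.0 * correct / count if count else 0.0
    return total, percent


if __name__ == "__main__":
    results = [("Hello", "hello"), ("world", "word"), ("OCR", "OCR")]
    print(score(results, case_sensitive=True))
    print(score(results, case_sensitive=False))
```

Lower total edit distance and higher correctly recognised words are better. Ignoring case can only keep or improve both figures, which matches the pattern in the table.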

Ranking Graphic
