Robust Reading Competition

method: HIK_OCR 2017-07-01

Authors: Zhanzhan Cheng*, Gang Zheng*, Fan Bai, Yunlu Xu, Jie Wang, Ying Yao, Zhaoxuan Fan, Zhiqian Zhang, Yi Niu (*equal contribution)

Description: The method is based on the sequence-to-sequence framework. In the encoder, images are resized to 100x100 and features are extracted with a CNN. In the decoder, the character sequence is generated by an attention-based decoder. The novelties of the method are: 1) a complex CNN-based model is proposed for feature extraction, with several special mechanisms, including a mask spatial transform, for handling arbitrarily placed text; 2) instead of a softmax loss, an Edit Probability Loss is developed for training; 3) a self-adaption gate mechanism is adopted to capture global information.
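To illustrate what a single step of an attention-based decoder does, here is a minimal sketch in plain Python. It uses generic dot-product attention, which is an assumption made for illustration; the description does not say which attention form HIK_OCR uses, and the names `softmax`, `attend`, `decoder_state` and `encoder_features` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_features):
    """One dot-product attention step (illustrative, not the authors' model).

    decoder_state: list of floats, the current decoder hidden state.
    encoder_features: list of feature vectors, one per image position.
    Returns (weights, context), where context is the weighted sum of features.
    """
    scores = [
        sum(q * k for q, k in zip(decoder_state, feat))
        for feat in encoder_features
    ]
    weights = softmax(scores)
    dim = len(encoder_features[0])
    context = [
        sum(w * feat[i] for w, feat in zip(weights, encoder_features))
        for i in range(dim)
    ]
    return weights, context


# Toy input: three image positions with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], features)
```

At each decoding step the context vector would be fed, together with the decoder state, into the layer that predicts the next character.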

Authors: Tencent-DPPR Team (Chunchao Guo, Weichen Zhang, Yi Li, Hui Song, Ming Liu, Hongfa Wang, Lei Xiao) & USTB-PRIR (Chun Yang, Zejun Li, Jianwei Wu, Jiebo Hou, Xu-Cheng Yin)

Description: Tencent-DPPR stands for Data Platform Precision Recommendation. First, they use a CNN to extract features from images. Second, they employ multiple LSTM-based models to generate different results, giving a candidate set for each image. Third, they design a heuristic mechanism to select the result with the maximum probability for each image.
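The third step can be sketched as follows. This is a minimal illustration, assuming each model returns a transcription together with a single probability score; the function name `select_best` and the example scores are hypothetical, and the team's actual heuristic is not described in more detail here.

```python
def select_best(candidates):
    """Pick the transcription with the highest probability.

    candidates: list of (text, probability) pairs, one per recognition model.
    Assumption: probabilities from different models are directly comparable.
    """
    if not candidates:
        raise ValueError("candidate set is empty")
    best_text, _ = max(candidates, key=lambda pair: pair[1])
    return best_text


# Hypothetical outputs of three recognisers for one word image.
candidate_set = [("hello", 0.62), ("hallo", 0.91), ("hel1o", 0.40)]
chosen = select_best(candidate_set)
```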

method: HKU-VisionLab 2017-06-30

Authors: Wei Liu, Chaofeng Chen, Bingbin Liu, Kwan-Yee Kenneth Wong

Description: They propose a Character-Aware Attention Network (Char-Net) for scene text with large spatial deformations. Their Char-Net consists of a hierarchical feature encoder and an LSTM-based decoder. The newly proposed encoder encodes the original text image at both the word and character levels, which enables Char-Net to handle severely distorted scene text. The whole neural network can be optimised in an end-to-end fashion. All the training data comes from public datasets for scene text recognition.

Ranking Table

| Date | Method | Total Edit Distance (case sensitive) | Correctly Recognised Words (case sensitive) | T.E.D. (case insensitive) | C.R.W. (case insensitive) |
| --- | --- | --- | --- | --- | --- |
| 2017-06-30 | Tencent-DPPR Team & USTB-PRIR | 4,022.12 | 36.91% | 1,233.46 | 70.83% |
| 2017-10-06 | CRNN - Sravya | 5,704.54 | 24.26% | 3,532.96 | 36.98% |
| 2017-06-30 | Enhancing Text Recognition Accuracy by Adding External Language Model | 7,231.87 | 17.88% | 5,555.89 | 29.69% |
| 2017-06-28 | LSTM based text recognition | 6,594.01 | 10.11% | 4,638.83 | 26.25% |
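The two metrics in the table can be sketched in a few lines of Python. This is an illustration, not the competition's official evaluation script: it assumes that each word's Levenshtein distance is normalised by the ground-truth length before summing (the fractional totals suggest some normalisation, but the exact rule is not stated here), and the names `edit_distance` and `evaluate` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def evaluate(pairs, case_sensitive=True):
    """Return (total edit distance, fraction of correctly recognised words).

    pairs: list of (prediction, ground_truth) strings.
    Assumption: each distance is divided by the ground-truth length.
    """
    if not pairs:
        raise ValueError("no word pairs to evaluate")
    total = 0.0
    correct = 0
    for pred, truth in pairs:
        if not case_sensitive:
            pred, truth = pred.lower(), truth.lower()
        total += edit_distance(pred, truth) / max(len(truth), 1)
        correct += pred == truth
    return total, correct / len(pairs)


# Hypothetical predictions paired with their ground-truth words.
pairs = [("Hello", "hello"), ("wor1d", "world"), ("text", "text")]
ted_cs, crw_cs = evaluate(pairs, case_sensitive=True)
ted_ci, crw_ci = evaluate(pairs, case_sensitive=False)
```

On this toy data the case-insensitive scores are better than the case-sensitive ones, the same direction as the gap between the two column pairs in the table.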

Ranking Graphic
