method: Tencent TEG OCR (2020-03-15)

Authors: Pei Xu, Shan Huang, Shen Huang, Qi Ju.

Description: We reimplemented the standalone recognition method from the end-to-end text spotting code released with Mask TextSpotter [TPAMI]. It is a sequence-to-sequence method based on 2D attention. For pretraining, we synthesize curved text images using the VGG SynthText method. For fine-tuning, we add public datasets, including ICDAR 2013-2015, CUTE, SVT, IIIT5k, RCTW2017, LSVT, and CTW, and do not use any private data.
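
To illustrate what one decoding step of 2D attention can look like, here is a minimal pure-Python sketch. It is not the submitted model: the dot-product scoring is an assumption (the description does not say how scores are computed), and the names `attention_2d`, `feature_map`, and `query` are hypothetical.

```python
import math


def attention_2d(feature_map, query):
    """One decoding step of dot-product attention over a 2D feature map.

    feature_map: H x W grid of C-dimensional feature vectors (nested lists).
    query: C-dimensional decoder hidden state (hypothetical stand-in).
    Returns (weights, context): an H x W grid of weights that sums to 1,
    and the C-dimensional attention-weighted sum of the features.
    """
    height, width = len(feature_map), len(feature_map[0])
    # Score every spatial position against the decoder state.
    scores = [
        [sum(f * q for f, q in zip(feature_map[i][j], query)) for j in range(width)]
        for i in range(height)
    ]
    # Softmax jointly over all H*W positions, not per row or per column.
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]
    # Context vector: weighted sum of the feature vectors.
    context = [
        sum(weights[i][j] * feature_map[i][j][c]
            for i in range(height) for j in range(width))
        for c in range(len(query))
    ]
    return weights, context


# Toy 2 x 2 feature map with 2 channels; the bottom-right cell matches the query best.
fmap = [[[1.0, 0.0], [0.0, 1.0]],
        [[0.0, 0.0], [5.0, 0.0]]]
weights, context = attention_2d(fmap, query=[1.0, 0.0])
```

Because the softmax runs over both spatial axes at once, the decoder can attend to characters anywhere in the image, which is what makes this style of attention suitable for curved text.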

Authors: Jeonghun Baek, Moonbin Yim, Junyeop Lee, and Hwalsuk Lee

Description: Before text recognition, we used the text detector called CRAFT as a preprocessing step.
For the recognition model, we used a thin-plate-spline (TPS) based spatial transformer network (STN), which normalizes the input text images, followed by a ResNet-based feature extractor, a BiLSTM, and an attention mechanism.
This model was developed based on the analysis of scene text recognition modules.
See our paper and source code.

Authors: Honbin Sun, Xiaomeng Song, Xiaoyu Yue, Youjiang Xu, Zhanghui Kuang (SenseTime Group)

Description: We propose an end-to-end model for multi-oriented scene text recognition. Our model is composed of a 31-layer ResNet, a GRU-based encoder-decoder framework, and a 2-dimensional attention module. Specifically, the ResNet extracts CNN feature maps from the input images; a 2-layer GRU receives one column or one row of the 2D feature maps at a time, followed by max-pooling along the vertical or horizontal axis; and another 2-layer GRU performs text classification at each step. More importantly, at each decoding step we use a tailored 2D attention mechanism to focus on text-relevant regions of the feature maps, conditioned on the current hidden state, which carries semantic information from the previously decoded results.
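
The pooling step in this description can be sketched as follows. This is one reading of the text, not the authors' code: it assumes the feature map is max-pooled along the vertical axis so that each column becomes a single vector, and that these vectors are the time steps a recurrent encoder would consume. The function name `max_pool_vertical` and the nested-list layout are hypothetical.

```python
def max_pool_vertical(feature_map):
    """Collapse an H x W x C feature map into a W-step sequence.

    For each column j and channel c, keep the maximum over the height axis.
    The result is a list of W vectors of length C, one per time step.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    return [
        [max(feature_map[i][j][c] for i in range(height)) for c in range(channels)]
        for j in range(width)
    ]


# Toy feature map: height 2, width 3, 2 channels.
fmap = [
    [[1, 4], [2, 0], [3, 3]],
    [[5, 0], [1, 1], [0, 6]],
]
sequence = max_pool_vertical(fmap)  # three time steps, one per column
```

Pooling along the horizontal axis instead would give one vector per row, which is the natural choice for vertically oriented text.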

Ranking Table

Date | Method | Result | Total words | Correct words
2020-03-15 | Tencent TEG OCR | 85.74% | 48426 | 35011
2019-05-01 | CRAFT (Preprocessing) + TPS-ResNet | 85.32% | 48426 | 34206
2019-05-01 | Attention based method for arbitrary-shaped scene text recognition | 85.20% | 48426 | 32495
2019-05-01 | Attention based method for scene text recognition | 85.18% | 48426 | 32534
2020-10-15 | transformer_v1 | 84.66% | 48426 | 33725
2019-04-30 | TPS-ResNet | 83.63% | 48426 | 33173
2020-07-09 | Lenovo-MI-Lab OCR | 81.81% | 48426 | 31201
2019-04-30 | CSN-ED | 81.23% | 48426 | 30018
2019-04-28 | class_5435_rotate | 80.60% | 48426 | 29141
2019-04-29 | MatchCRNN | 72.61% | 48426 | 24781
2019-04-27 | Ensemble and post processes | 71.27% | 48426 | 26344
2019-05-01 | So Cold 2.0 | 69.76% | 48426 | 20650
2021-01-08 | SogouMM | 67.08% | 48426 | 28362
2019-11-06 | Sogou_OCR | 66.92% | 48426 | 27571
2019-11-08 | SogouMM | 66.81% | 48426 | 28272
2019-05-01 | Fudan-Supremind Recognition | 66.15% | 48426 | 20827
2019-10-24 | hw-noah-lab-gts-CV-team | 66.13% | 48426 | 27029
2019-04-30 | CUTeOCR | 65.38% | 48426 | 26078
2019-04-30 | PKU Team Zero | 65.06% | 48426 | 26216
2019-04-29 | NPU-ASGO | 63.82% | 48426 | 25341
2019-05-01 | CIGIT & XJTLU | 63.15% | 48426 | 24956
2019-05-01 | Alchera AI | 61.61% | 48426 | 23573
2019-04-30 | Irregular Text Recognizer with Attention Mechanism | 61.42% | 48426 | 22739
2019-04-30 | LCT_OCR(中国科学院信息工程研究所) | 59.77% | 48426 | 21432
2019-04-26 | Irregular Text Recognition with Direction Classification and a Rectification Network | 58.41% | 48426 | 21846
2019-04-29 | task2x | 56.53% | 48426 | 17787
2019-04-21 | Arbitrary shape scene text recognition based on CNN and Attention Enhanced Bi-directional LSTM | 54.49% | 48426 | 19792
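
The two count columns allow a quick sanity check. The sketch below computes plain word accuracy (correct words divided by total words) for the top entry, which has 35011 correct words out of 48426. That ratio is about 72.3%, not the 85.74% in the Result column, so Result appears to be a different metric; this page does not define it. The helper name `word_accuracy` is hypothetical.

```python
def word_accuracy(correct_words: int, total_words: int) -> float:
    """Fraction of words recognized exactly, from the two count columns."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return correct_words / total_words


# Counts taken from the first ranking row (Tencent TEG OCR).
top_entry = word_accuracy(correct_words=35011, total_words=48426)
print(f"{top_entry:.2%}")
```

This also explains why the ranking is not monotonic in the Correct words column: entries are ordered by Result, not by exact-match word count.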

Ranking Graphic