method: Shopee MMU OCR (2022-10-31)
Authors: Jianqiang Liu, Hanfei Xu, Bin Zheng, Eric W, Ronnie T, Alex X
Affiliation: Shopee MMU OCR
Description: Our method adopts a transformer-based context-aware framework. We utilize a hybrid architecture encoder and a context-aware autoregressive decoder to construct the recognition pipeline. Finally, a simple but effective multi-model fusion strategy is adopted.
method: SogouMM (2019-09-05)
Authors: Xu Liu, Hongyuan Zhang, Yan Zhang, Bo Qin, Tao Wei
Description: Our method is based on 2D attention: we use ResNet as the backbone and apply a tailored 2D-attention module. The result is generated by a single model without ensemble tricks.
method: SenseTime-CKD (2019-06-22)
Authors: CKD Team (Xiaocong Cai, Wenyang Hu, Jun Hou, Miaomiao Cheng)
Description:
1) The method is designed based on the Rectify-Encoder-Decoder framework.
2) Our training data contains about 5,600,000 images from Synth90k, SynthText, SynthAdd and some academic datasets.
3) Variable-length input is used, with a maximum input size of 64x160. Images are first rectified by an STN (spatial transformer network). The rectified images are then passed to CNN backbones (e.g. ResNet) to extract features. For the decoder, we train different models with three kinds of decoders: CTC, 1D attention, and 2D attention. The predictions of these models are then ensembled.
4) In addition, some data augmentation methods and other tricks are used in this work.
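The variable-length input described in step 3 can be illustrated with a small sizing helper. This is only a sketch under stated assumptions: it reads "64x160" as a fixed height of 64 pixels and a width capped at 160 pixels, and it assumes the aspect ratio is preserved before capping. The submission does not say how the resize is actually done, and the function name `target_size` is hypothetical.

```python
def target_size(width: int, height: int,
                target_h: int = 64, max_w: int = 160) -> tuple[int, int]:
    """Return (height, width) for a cropped word image.

    Assumption: height is fixed at target_h, width scales with the
    aspect ratio and is clipped to the range [1, max_w].
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale the width by the same factor that maps height to target_h.
    scaled_w = round(width * target_h / height)
    # Clip so very long words do not exceed the assumed maximum width.
    return target_h, max(1, min(scaled_w, max_w))


if __name__ == "__main__":
    print(target_size(200, 100))   # moderately wide crop
    print(target_size(1000, 100))  # very wide crop, width is capped
```

Under these assumptions, short words keep their natural width while long words are squeezed to the maximum width.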
Date | Method | Total Edit Distance (case sensitive) | Correctly Recognised Words (case sensitive) | Total Edit Distance (case insensitive) | Correctly Recognised Words (case insensitive)
---|---|---|---|---|---
2022-10-31 | Shopee MMU OCR | 3,537.9001 | 43.29% | 746.5984 | 78.21%
2019-09-05 | SogouMM | 3,496.3121 | 44.64% | 1,037.2197 | 77.97%
2019-06-22 | SenseTime-CKD | 4,054.8236 | 41.52% | 824.6449 | 77.22%
2017-07-01 | HIK_OCR | 3,661.5785 | 41.72% | 899.1009 | 76.11%
2019-08-19 | MASTER-Ping An Property & Casualty Insurance Co | 3,272.0810 | 49.09% | 1,203.4201 | 71.33%
2022-02-23 | Singularity Systems Inc OCR | 4,410.0816 | 38.87% | 1,326.4073 | 70.94%
2017-06-30 | Tencent-DPPR Team & USTB-PRIR | 4,022.1224 | 36.91% | 1,233.4609 | 70.83%
2019-02-25 | CLOVA-AI | 3,594.4842 | 47.35% | 1,583.7724 | 69.27%
2021-05-14 | ustc_pr316 | 3,655.9347 | 43.39% | 1,311.9272 | 68.44%
2018-12-19 | SAR | 4,002.3563 | 41.27% | 1,528.7396 | 66.85%
2019-10-09 | Attention-OCR | 4,320.1734 | 37.87% | 1,251.9841 | 66.73%
2017-06-30 | HKU-VisionLab | 3,921.9388 | 40.17% | 1,903.3725 | 59.29%
2017-06-30 | BRTRS-Recognition | 4,895.9593 | 28.18% | 2,282.4888 | 59.25%
2019-07-03 | Advanced Readotron | 4,698.4158 | 34.52% | 1,998.2297 | 58.03%
2019-09-04 | juxinli | 5,544.6025 | 28.14% | 3,169.8610 | 45.41%
2017-06-29 | CCFLAB | 4,743.2752 | 26.52% | 2,982.6609 | 42.66%
2017-10-06 | CRNN - Sravya | 5,704.5379 | 24.26% | 3,532.9616 | 36.98%
2017-06-30 | 3CNN_2BiLSTM_CTC | 6,405.6129 | 12.19% | 4,395.4174 | 30.17%
2017-06-30 | Enhancing Text Recognition Accuracy by Adding External Language Model | 7,231.8718 | 17.88% | 5,555.8922 | 29.69%
2017-06-28 | LSTM based text recognition | 6,594.0069 | 10.11% | 4,638.8345 | 26.25%
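To make the two table metrics concrete, the sketch below computes a total edit distance and a correctly-recognised-words ratio for a list of predictions. It is an illustration, not the benchmark's official scorer: the fractional totals in the table suggest a normalized per-word distance, but the normalization is not stated here, so this sketch assumes division by the longer of the two strings. The names `levenshtein` and `score` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def score(preds: list[str], gts: list[str],
          case_sensitive: bool = True) -> tuple[float, float]:
    """Return (total normalized edit distance, fraction of exact matches)."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    total, correct = 0.0, 0
    for p, g in zip(preds, gts):
        if not case_sensitive:
            p, g = p.lower(), g.lower()
        # Assumed normalization: divide by the longer string's length.
        denom = max(len(p), len(g))
        if denom:
            total += levenshtein(p, g) / denom
        correct += int(p == g)
    crw = correct / len(gts) if gts else 0.0
    return total, crw


if __name__ == "__main__":
    preds = ["Hello", "world", "cat"]
    gts = ["hello", "world", "cut"]
    print(score(preds, gts, case_sensitive=True))
    print(score(preds, gts, case_sensitive=False))
```

In this toy example the case-insensitive pass counts "Hello" as correct, which lowers the total edit distance and raises the share of correctly recognised words. The table shows the same direction of change between its case-sensitive and case-insensitive columns.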