method: StrucTexT (2021-11-24)

Authors: Baidu-OCR

Affiliation: Baidu

Description: 1. StrucTexT is a joint segment-level and token-level representation enhancement model for document image understanding, covering documents such as PDFs, invoices, and receipts.
2. The StrucTexT large model is pre-trained on 50 million Chinese and English document images.
3. We fine-tune the single large pre-trained model on the SROIE dataset.

method: GraphDoc (2022-03-18)

Authors: Zhenrong Zhang, Jiefeng Ma, Jun Du

Affiliation: National Engineering Research Center of Speech and Language Information Processing (NERC-SLIP), University of Science and Technology of China.

Email: zzr666@mail.ustc.edu.cn

Description: 1. GraphDoc is a multi-modal, graph attention-based model for various document understanding tasks.
2. GraphDoc is pre-trained on the RVL-CDIP training set, which contains only 320k document images.
4. Following the same evaluation rules as other submissions, OCR mismatch errors are excluded from this submission.

method: LAKE (2022-01-21)

Authors: LAKE

Description: 1. Document-Parser provides enhanced layout information.
2. This information is used for layout-enhanced pre-training based on RoBERTa.

Ranking Table

Date | Method | Recall | Precision | Hmean
2021-11-24 | StrucTexT | 98.70% | 98.70% | 98.70%
2022-03-18 | GraphDoc | 98.13% | 98.77% | 98.45%
2022-01-21 | LAKE | 97.26% | 99.48% | 98.36%
2022-04-15 | Character-Aware CNN + Highway + BiLSTM 2.0 | 98.20% | 98.48% | 98.34%
2021-04-19 | IE | 97.05% | 99.56% | 98.29%
2021-07-20 | Linklogis_BeeAI | 97.05% | 99.34% | 98.18%
2021-01-02 | Applica.ai Lambert 2.0 + Excluding OCR Errors + Fixing total entity | 96.83% | 99.56% | 98.17%
2021-06-02 | Multimodal Transformer for Information Extraction | 96.76% | 99.56% | 98.14%
2021-02-16 | Applica.ai TILT + Excluding OCR Errors + Fixing total entity | 96.83% | 99.41% | 98.10%
2020-12-24 | LayoutLM 2.0 (single model) | 96.61% | 99.04% | 97.81%
2021-01-01 | Applica.ai Lambert 2.0 + Excluding OCR Mismatch | 96.40% | 99.11% | 97.74%
2020-12-07 | Tencent Youtu | 96.47% | 98.89% | 97.67%
2020-12-28 | IE method | 96.33% | 98.53% | 97.41%
2020-05-07 | HIK_OCR_Exclude_ocr_mismatch | 96.33% | 98.38% | 97.34%
2020-04-18 | LayoutLM + Excluding OCR Mismatch | 96.04% | 98.16% | 97.09%
2021-10-25 | Character-Aware CNN + Highway + BiLSTM 1.0 | 96.18% | 97.45% | 96.81%
2022-07-19 | GraphRevisedIE | 96.04% | 96.80% | 96.42%
2020-04-15 | PICK-PAPCIC & XZMU | 95.46% | 96.79% | 96.12%
2020-04-16 | LayoutLM | 96.04% | 96.04% | 96.04%
2020-03-26 | Applica.ai roberta-base-2D | 95.39% | 95.80% | 95.60%
2021-12-08 | 1208-cblm | 94.81% | 95.22% | 95.02%
2021-12-08 | 1208-cblm | 94.24% | 94.65% | 94.44%
2022-02-28 | CHL | 94.24% | 94.65% | 94.44%
2020-06-05 | great | 94.24% | 94.24% | 94.24%
2019-08-14 | PATech_AICenter | 94.02% | 94.02% | 94.02%
2021-06-11 | IE + GCN+ OCR | 92.44% | 94.90% | 93.65%
2021-02-21 | RoBERTa-base finetuned on business documents | 92.80% | 93.27% | 93.03%
2021-06-13 | GEM AI -OCR TEAM | 92.72% | 92.72% | 92.72%
2021-10-25 | layoutlm V2 | 91.71% | 93.12% | 92.41%
2021-02-21 | RoBERTa-base | 92.22% | 92.55% | 92.39%
2020-05-23 | GIE | 91.21% | 93.43% | 92.31%
2021-09-08 | test excluding ocr error | 91.93% | 91.93% | 91.93%
2020-07-07 | Taikang Insurance Group Research Institute | 91.79% | 91.99% | 91.89%
2021-06-30 | N | 91.79% | 91.79% | 91.79%
2019-08-05 | PATECH_CHENGDU_OCR_V2 | 91.21% | 91.21% | 91.21%
2020-02-20 | Character & Word BiLSTM Encoder | 90.85% | 90.85% | 90.85%
2019-05-05 | Ping An Property & Casualty Insurance Company | 90.49% | 90.49% | 90.49%
2019-04-29 | Enetity detection | 89.70% | 89.70% | 89.70%
2019-05-04 | H&H Lab | 89.63% | 89.63% | 89.63%
2019-05-02 | CLOVA OCR | 89.05% | 89.05% | 89.05%
2021-12-06 | 1129-clm | 88.76% | 89.08% | 88.92%
2021-11-10 | LSTM without OCRerror and RM | 86.46% | 86.46% | 86.46%
2020-12-29 | coldog | 86.17% | 86.17% | 86.17%
2019-09-23 | ASTRI-CCT-MSA | 85.45% | 85.45% | 85.45%
2019-05-05 | GraphLayout | 85.09% | 85.09% | 85.09%
2021-03-02 | Qubitrics | 82.06% | 86.75% | 84.34%
2020-06-15 | End-to-end learning with PGN | 83.86% | 83.86% | 83.86%
2019-05-04 | HeReceipt-withoutRM | 83.00% | 83.24% | 83.12%
2020-06-17 | Graph Neural Net with Bert Embeddings | 82.78% | 82.78% | 82.78%
2019-05-06 | BOE_IOT_AIBD | 82.71% | 82.71% | 82.71%
2019-05-05 | PATECH_CHENGDU_OCR | 81.70% | 82.29% | 82.00%
2020-05-28 | SROIE LSTM - Axel Alejandro Ramos García | 81.99% | 81.99% | 81.99%
2020-04-28 | BERT-MRC | 81.05% | 81.05% | 81.05%
2021-03-09 | Character Level BiLSTM | 79.25% | 79.25% | 79.25%
2020-05-29 | Cool Method Remix | 79.03% | 79.03% | 79.03%
2019-04-30 | NER with spaCy model | 78.96% | 79.02% | 78.99%
2021-07-05 | test2 | 78.89% | 78.89% | 78.89%
2020-12-28 | Custom Named Entity Recognition | 77.59% | 77.59% | 77.59%
2019-05-05 | CITlab Argus Information Extraction (positional & line features, enhanced gt) | 77.38% | 77.38% | 77.38%
2021-01-02 | lstm deep | 77.38% | 77.38% | 77.38%
2022-01-25 | PIC-proj | 76.95% | 76.95% | 76.95%
2021-01-02 | lstm standard method trained 100 epochs constant learning rate | 76.15% | 76.15% | 76.15%
2019-04-28 | A Simple Method for Key Information Extraction as Character-wise Classification with LSTM | 75.58% | 75.58% | 75.58%
2019-04-30 | Bi-directional LSTM-CNNs-CRF (version2) | 74.86% | 74.86% | 74.86%
2019-05-05 | Location-aware BERT model for Text Information Extraction | 74.42% | 74.42% | 74.42%
2024-07-26 | loixc-vqa | 73.92% | 73.92% | 73.92%
2020-05-23 | test | 73.63% | 73.63% | 73.63%
2023-12-04 | 5shout | 71.18% | 71.18% | 71.18%
2021-04-07 | Token level multi modal bilstm | 70.03% | 70.03% | 70.03%
2021-04-07 | Token level bert embed + bilstm | 66.43% | 66.43% | 66.43%
2019-04-30 | BERT with Multi-task Confidence Prediction | 66.14% | 66.14% | 66.14%
2023-12-03 | 0 shots | 65.92% | 65.92% | 65.92%
2023-12-03 | 2shot_tem02 | 64.27% | 64.27% | 64.27%
2019-05-02 | With receipt framing | 63.04% | 63.54% | 63.29%
2021-09-24 | layoutLMtest | 61.38% | 61.38% | 61.38%
2019-05-05 | IFLYTEK-textNLP_v2 | 61.24% | 61.24% | 61.24%
2019-05-05 | SituTech_OCR | 59.01% | 62.38% | 60.64%
2024-07-31 | tixc-vqa | 57.93% | 57.93% | 57.93%
2022-01-30 | LayoutLMv2 | 36.24% | 36.24% | 36.24%
2023-12-17 | 1shot_position | 32.71% | 32.71% | 32.71%
2019-04-30 | Key Information Extraction from Scanned Receipts | 28.75% | 36.31% | 32.09%
2022-02-01 | Pytesseract + Character Level LSTM + Regex for Dates | 26.95% | 26.95% | 26.95%
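
The Hmean column in the table above appears to be the harmonic mean of Recall and Precision. The following is a minimal sketch under that assumption; the function name `hmean` is hypothetical and is not taken from the competition's evaluation code, which may compute the score from raw entity counts rather than from rounded percentages.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both given in percent).

    Assumption: this mirrors how the table's Hmean column is derived.
    """
    if recall + precision == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * recall * precision / (recall + precision)


# Recall and precision values copied from the GraphDoc and LAKE rows.
for name, r, p in [("GraphDoc", 98.13, 98.77), ("LAKE", 97.26, 99.48)]:
    print(f"{name}: {hmean(r, p):.2f}%")
```

For these two rows the computed values agree with the listed Hmean to two decimals. Because the inputs here are already rounded, other rows may differ from the table in the last digit.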

Ranking Graphic