method: StrucTexT (2021-11-24)
Authors: Baidu-OCR
Affiliation: Baidu
Description: 1. StrucTexT is a joint segment-level and token-level representation enhancement model for document image understanding of PDFs, invoices, receipts, and similar documents.
2. The large StrucTexT model is pre-trained on 50 million Chinese and English document images.
3. The single large pre-trained model is fine-tuned on the SROIE dataset.
method: GraphDoc (2022-03-18)
Authors: Zhenrong Zhang, Jiefeng Ma, Jun Du
Affiliation: National Engineering Research Center of Speech and Language Information Processing (NERC-SLIP), University of Science and Technology of China.
Email: zzr666@mail.ustc.edu.cn
Description: 1. GraphDoc is a multimodal graph-attention-based model for various document understanding tasks.
2. GraphDoc is pretrained on the RVL-CDIP training dataset, which contains only 320k document images.
3. Following the same evaluation rules as other submissions, OCR mismatch errors are excluded from the submission.
method: LAKE (2022-01-21)
Authors: LAKE
Description: 1. Document-Parser provides enhanced layout information.
2. Such layout information is applied in layout-enhanced pre-training based on RoBERTa.
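Several submissions above note that OCR mismatch errors are excluded from scoring. The underlying metric is entity-level exact match: a predicted field counts as a true positive only if its value equals the ground truth exactly. The following is a minimal sketch of that scoring, assuming predictions and ground truths are given as per-receipt field-to-value dicts; the field names in the usage example are illustrative, not the official SROIE schema.

```python
def entity_f1(predictions, ground_truths):
    """Micro-averaged precision, recall, and Hmean (F1) over extracted entities.

    predictions / ground_truths: one dict per receipt, mapping field -> value.
    A prediction is correct only on an exact value match (hypothetical sketch,
    not the official evaluation script).
    """
    tp = n_pred = n_gt = 0
    for pred, gt in zip(predictions, ground_truths):
        n_pred += len(pred)
        n_gt += len(gt)
        # exact-match comparison per predicted field
        tp += sum(1 for field, value in pred.items() if gt.get(field) == value)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gt if n_gt else 0.0
    hmean = (2 * precision * recall / (precision + recall)
             if precision + recall else 0.0)
    return precision, recall, hmean
```

For instance, predicting `{"company": "ACME", "total": "9.99"}` against ground truth `{"company": "ACME", "total": "10.99"}` yields precision, recall, and Hmean of 0.5 each: one of two fields matches exactly.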
Date | Method | Recall | Precision | Hmean
---|---|---|---|---
2021-11-24 | StrucTexT | 98.70% | 98.70% | 98.70%
2022-03-18 | GraphDoc | 98.13% | 98.77% | 98.45%
2022-01-21 | LAKE | 97.26% | 99.48% | 98.36%
2022-04-15 | Character-Aware CNN + Highway + BiLSTM 2.0 | 98.20% | 98.48% | 98.34%
2021-04-19 | IE | 97.05% | 99.56% | 98.29%
2021-07-20 | Linklogis_BeeAI | 97.05% | 99.34% | 98.18%
2021-01-02 | Applica.ai Lambert 2.0 + Excluding OCR Errors + Fixing total entity | 96.83% | 99.56% | 98.17%
2021-06-02 | Multimodal Transformer for Information Extraction | 96.76% | 99.56% | 98.14%
2021-02-16 | Applica.ai TILT + Excluding OCR Errors + Fixing total entity | 96.83% | 99.41% | 98.10%
2020-12-24 | LayoutLM 2.0 (single model) | 96.61% | 99.04% | 97.81%
2021-01-01 | Applica.ai Lambert 2.0 + Excluding OCR Mismatch | 96.40% | 99.11% | 97.74%
2020-12-07 | Tencent Youtu | 96.47% | 98.89% | 97.67%
2020-12-28 | IE method | 96.33% | 98.53% | 97.41%
2020-05-07 | HIK_OCR_Exclude_ocr_mismatch | 96.33% | 98.38% | 97.34%
2020-04-18 | LayoutLM + Excluding OCR Mismatch | 96.04% | 98.16% | 97.09%
2021-10-25 | Character-Aware CNN + Highway + BiLSTM 1.0 | 96.18% | 97.45% | 96.81%
2022-07-19 | GraphRevisedIE | 96.04% | 96.80% | 96.42%
2020-04-15 | PICK-PAPCIC & XZMU | 95.46% | 96.79% | 96.12%
2020-04-16 | LayoutLM | 96.04% | 96.04% | 96.04%
2020-03-26 | Applica.ai roberta-base-2D | 95.39% | 95.80% | 95.60%
2021-12-08 | 1208-cblm | 94.81% | 95.22% | 95.02%
2021-12-08 | 1208-cblm | 94.24% | 94.65% | 94.44%
2022-02-28 | CHL | 94.24% | 94.65% | 94.44%
2020-06-05 | great | 94.24% | 94.24% | 94.24%
2019-08-14 | PATech_AICenter | 94.02% | 94.02% | 94.02%
2021-06-11 | IE + GCN + OCR | 92.44% | 94.90% | 93.65%
2021-02-21 | RoBERTa-base finetuned on business documents | 92.80% | 93.27% | 93.03%
2021-06-13 | GEM AI-OCR TEAM | 92.72% | 92.72% | 92.72%
2021-10-25 | layoutlm V2 | 91.71% | 93.12% | 92.41%
2021-02-21 | RoBERTa-base | 92.22% | 92.55% | 92.39%
2020-05-23 | GIE | 91.21% | 93.43% | 92.31%
2021-09-08 | test excluding ocr error | 91.93% | 91.93% | 91.93%
2020-07-07 | Taikang Insurance Group Research Institute | 91.79% | 91.99% | 91.89%
2021-06-30 | N | 91.79% | 91.79% | 91.79%
2019-08-05 | PATECH_CHENGDU_OCR_V2 | 91.21% | 91.21% | 91.21%
2020-02-20 | Character & Word BiLSTM Encoder | 90.85% | 90.85% | 90.85%
2019-05-05 | Ping An Property & Casualty Insurance Company | 90.49% | 90.49% | 90.49%
2019-04-29 | Enetity detection | 89.70% | 89.70% | 89.70%
2019-05-04 | H&H Lab | 89.63% | 89.63% | 89.63%
2019-05-02 | CLOVA OCR | 89.05% | 89.05% | 89.05%
2021-12-06 | 1129-clm | 88.76% | 89.08% | 88.92%
2021-11-10 | LSTM without OCRerror and RM | 86.46% | 86.46% | 86.46%
2020-12-29 | coldog | 86.17% | 86.17% | 86.17%
2019-09-23 | ASTRI-CCT-MSA | 85.45% | 85.45% | 85.45%
2019-05-05 | GraphLayout | 85.09% | 85.09% | 85.09%
2021-03-02 | Qubitrics | 82.06% | 86.75% | 84.34%
2020-06-15 | End-to-end learning with PGN | 83.86% | 83.86% | 83.86%
2019-05-04 | HeReceipt-withoutRM | 83.00% | 83.24% | 83.12%
2020-06-17 | Graph Neural Net with Bert Embeddings | 82.78% | 82.78% | 82.78%
2019-05-06 | BOE_IOT_AIBD | 82.71% | 82.71% | 82.71%
2019-05-05 | PATECH_CHENGDU_OCR | 81.70% | 82.29% | 82.00%
2020-05-28 | SROIE LSTM - Axel Alejandro Ramos García | 81.99% | 81.99% | 81.99%
2020-04-28 | BERT-MRC | 81.05% | 81.05% | 81.05%
2021-03-09 | Character Level BiLSTM | 79.25% | 79.25% | 79.25%
2020-05-29 | Cool Method Remix | 79.03% | 79.03% | 79.03%
2019-04-30 | NER with spaCy model | 78.96% | 79.02% | 78.99%
2021-07-05 | test2 | 78.89% | 78.89% | 78.89%
2020-12-28 | Custom Named Entity Recognition | 77.59% | 77.59% | 77.59%
2019-05-05 | CITlab Argus Information Extraction (positional & line features, enhanced gt) | 77.38% | 77.38% | 77.38%
2021-01-02 | lstm deep | 77.38% | 77.38% | 77.38%
2022-01-25 | PIC-proj | 76.95% | 76.95% | 76.95%
2021-01-02 | lstm standard method trained 100 epochs constant learning rate | 76.15% | 76.15% | 76.15%
2019-04-28 | A Simple Method for Key Information Extraction as Character-wise Classification with LSTM | 75.58% | 75.58% | 75.58%
2019-04-30 | Bi-directional LSTM-CNNs-CRF (version2) | 74.86% | 74.86% | 74.86%
2019-05-05 | Location-aware BERT model for Text Information Extraction | 74.42% | 74.42% | 74.42%
2024-07-26 | loixc-vqa | 73.92% | 73.92% | 73.92%
2020-05-23 | test | 73.63% | 73.63% | 73.63%
2023-12-04 | 5shout | 71.18% | 71.18% | 71.18%
2021-04-07 | Token level multi modal bilstm | 70.03% | 70.03% | 70.03%
2021-04-07 | Token level bert embed + bilstm | 66.43% | 66.43% | 66.43%
2019-04-30 | BERT with Multi-task Confidence Prediction | 66.14% | 66.14% | 66.14%
2023-12-03 | 0 shots | 65.92% | 65.92% | 65.92%
2023-12-03 | 2shot_tem02 | 64.27% | 64.27% | 64.27%
2019-05-02 | With receipt framing | 63.04% | 63.54% | 63.29%
2021-09-24 | layoutLMtest | 61.38% | 61.38% | 61.38%
2019-05-05 | IFLYTEK-textNLP_v2 | 61.24% | 61.24% | 61.24%
2019-05-05 | SituTech_OCR | 59.01% | 62.38% | 60.64%
2024-07-31 | tixc-vqa | 57.93% | 57.93% | 57.93%
2022-01-30 | LayoutLMv2 | 36.24% | 36.24% | 36.24%
2023-12-17 | 1shot_position | 32.71% | 32.71% | 32.71%
2019-04-30 | Key Information Extraction from Scanned Receipts | 28.75% | 36.31% | 32.09%
2022-02-01 | Pytesseract + Character Level LSTM + Regex for Dates | 26.95% | 26.95% | 26.95%
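The Hmean column is the harmonic mean of recall and precision (i.e. the F1 score), which is why rows with equal recall and precision show the same value in all three columns. A minimal sketch of the computation, checked against the GraphDoc row:

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (the table's Hmean column)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)

# GraphDoc's row: 98.13% recall, 98.77% precision
print(round(hmean(98.13, 98.77), 2))  # → 98.45
```

Because the harmonic mean is dominated by the smaller of the two inputs, a high-precision submission such as LAKE (99.48% precision, 97.26% recall) still ends up with an Hmean closer to its recall.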