- Task 1 - E2E Complex Entity Linking
- Task 2 - E2E Complex Entity Labeling
- Task 3 - E2E Zero-shot Structured Text Extraction
- Task 4 - E2E Few-shot Structured Text Extraction
method: OpenDoc(single model), 2023-10-09
Authors: Huan Chen, Ya Guo, Yi Tu, Jinyang Tang, Chong Zhang, Huijia Zhu
Affiliation: Ant Group
Email: chenhuan.chen@antgroup.com
Description: 1. We connect a LayoutMask-0.1b encoder to an AntGLM-10b decoder through a linear projection.
2. We merge two OCR results with a union strategy based on IoU.
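The description does not say how the IoU-based union is computed. One plausible reading, sketched below, keeps every box from the first OCR engine and adds a box from the second engine only when it does not overlap any kept box. The box format, the function names `iou` and `union_ocr`, and the 0.5 threshold are assumptions made for illustration, not details taken from the submission.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]
OcrItem = Tuple[Box, str]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def union_ocr(primary: List[OcrItem],
              secondary: List[OcrItem],
              iou_threshold: float = 0.5) -> List[OcrItem]:
    """Keep all primary items; add secondary items that match no primary box."""
    merged = list(primary)
    for box, text in secondary:
        # A secondary box counts as a duplicate if it overlaps any primary box
        # at or above the (assumed) threshold.
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in primary):
            merged.append((box, text))
    return merged
```

With this rule, text detected by only one engine survives the merge, while text detected by both is kept once, using the first engine's transcription.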
Tu, Yi, et al. "LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding." arXiv preprint arXiv:2305.18721 (2023).
Du, Zhengxiao, et al. "GLM: General Language Model Pretraining with Autoregressive Blank Infilling." arXiv preprint arXiv:2103.10360 (2021).
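The linear projection between the LayoutMask encoder and the AntGLM decoder can be pictured as a single affine map applied to each encoder hidden state so that it matches the decoder's embedding width. The sketch below uses plain Python lists and made-up dimensions; the class name `LinearProjection`, the initialization scheme, and the sizes are assumptions, since the submission gives no implementation details.

```python
import random
from typing import List

Vector = List[float]


class LinearProjection:
    """y = W x + b, mapping encoder-width vectors to decoder-width vectors."""

    def __init__(self, in_dim: int, out_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        scale = in_dim ** -0.5  # assumed initialization; any small range works
        self.weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
                       for _ in range(out_dim)]
        self.bias = [0.0] * out_dim

    def __call__(self, x: Vector) -> Vector:
        if len(x) != len(self.weight[0]):
            raise ValueError("input width does not match the projection")
        return [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(self.weight, self.bias)]


def project_sequence(proj: LinearProjection,
                     encoder_states: List[Vector]) -> List[Vector]:
    """Project every token's encoder state into the decoder's input space."""
    return [proj(h) for h in encoder_states]
```

In a real system the projected sequence would be fed to the decoder as input embeddings, and the weights would be learned during training rather than left at their random initial values.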
method: sample-1, 2023-03-20
Authors: Zhenrong Zhang, Lei Jiang, Youhui Guo, Jianshu Zhang, Jun Du
Affiliation: University of Science and Technology of China (USTC), iFLYTEK AI Research
Email: zzr666@mail.ustc.edu.cn
Description: 1. We use UniLM[2] and LiLT[3] as decoders to exploit text and layout information; OCR results, sorted by manual rules, are fed into the decoders to predict the target.
2. We ensemble DocPrompt[1], UniLM[2], and LiLT[3].
[1] https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/ernie-layout/README_ch.md
[2] https://github.com/microsoft/unilm/blob/master/s2s-ft/
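The manual sorting rules applied to the OCR results are not spelled out. A common heuristic, shown here only as an illustration, groups boxes into text lines by vertical center and then reads each line left to right. The function name `sort_reading_order`, the box format, and the `line_tolerance` value are hypothetical.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]
OcrItem = Tuple[Box, str]


def sort_reading_order(items: List[OcrItem],
                       line_tolerance: float = 0.5) -> List[OcrItem]:
    """Order OCR items top to bottom, then left to right within each line."""

    def center_y(box: Box) -> float:
        return (box[1] + box[3]) / 2

    def height(box: Box) -> float:
        return box[3] - box[1]

    lines: List[List[OcrItem]] = []
    for item in sorted(items, key=lambda it: center_y(it[0])):
        box = item[0]
        if lines:
            ref_box = lines[-1][0][0]  # first box of the current line
            limit = line_tolerance * max(height(box), height(ref_box))
            if abs(center_y(box) - center_y(ref_box)) <= limit:
                lines[-1].append(item)
                continue
        lines.append([item])

    ordered: List[OcrItem] = []
    for line in lines:
        ordered.extend(sorted(line, key=lambda it: it[0][0]))
    return ordered
```

Multi-column or rotated documents would need extra rules on top of this, which is presumably why the submission describes its ordering as manual.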
method: LayoutLMv3, 2023-03-14
Authors: Minhui Wu (伍敏慧), Mei Jiang (姜媚), Chen Li (李琛), Jing Lv (吕静), Huiwen Shi (石惠文)
Affiliation: TencentOCR
Description: Based on a large pretrained model and the LayoutLMv3 architecture, with some pre- and post-processing methods.
Date | Method | score | score1 | score2
---|---|---|---|---
2023-10-09 | OpenDoc(single model) | 78.98% | 82.69% | 64.15%
2023-03-20 | sample-1 | 78.71% | 82.07% | 65.27%
2023-03-14 | LayoutLMv3 | 77.35% | 80.01% | 66.71%
2023-03-13 | LayoutLMv3 | 76.90% | 79.58% | 66.20%
2023-03-17 | KIE-Brain3 | 71.44% | 74.90% | 57.59%
2023-03-17 | KIE-Brainer2 | 71.24% | 74.82% | 56.92%
2023-03-17 | KIE-Brain | 71.24% | 74.87% | 56.69%
2023-03-16 | zero-shot-qa | 70.75% | 74.24% | 56.81%
2023-03-15 | zero shot qa | 68.23% | 71.89% | 53.60%
2023-03-17 | task3-2 | 62.59% | 65.52% | 50.85%
2023-03-17 | task3_1 | 56.11% | 58.31% | 47.33%
2023-03-17 | task3_0 | 47.39% | 49.16% | 40.29%
2023-03-13 | task3_base | 43.70% | 46.09% | 34.15%
2023-03-10 | test | 2.03% | 2.30% | 0.97%
2023-03-13 | Donut_VIE | 1.37% | 1.47% | 1.01%
2023-03-13 | first commit | 0.00% | 0.00% | 0.00%