Method: GraphDoc+Classify+Merge (2023-05-25)

Authors: Yan Wang, Jiefeng Ma, Zhenrong Zhang, Pengfei Hu, Jianshu Zhang, Jun Du

Affiliation: University of Science and Technology of China (USTC), iFLYTEK AI Research

Description: We pre-trained several GraphDoc models on the provided unlabelled documents under different configurations, then fine-tuned them on the training set for 500-1000 epochs. The models classify OCR boxes into categories, and a Merger module that we propose aggregates the classified boxes. We also applied pre- and post-processing based on the text content and the distances between OCR boxes. Finally, we used model ensembling to further improve system performance.
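The aggregation step described above can be illustrated with a simple distance-based rule. The following is a minimal sketch and not the authors' Merger module: the `Box` type, the `merge_boxes` function, the `max_gap` threshold, and the example labels are all hypothetical, and the rule (join same-label boxes on the same line when the horizontal gap is small) is only one plausible way to use "distances between OCR boxes".

```python
from dataclasses import dataclass


@dataclass
class Box:
    # Hypothetical container for one classified OCR box.
    text: str
    label: str
    x0: float
    y0: float
    x1: float
    y1: float


def merge_boxes(boxes, max_gap=10.0):
    """Greedily merge same-label boxes that overlap vertically and whose
    horizontal gap is at most max_gap (an assumed threshold)."""
    merged = []
    # Sorting by (y0, x0) approximates reading order; it is only reliable
    # when boxes on one line share roughly the same top coordinate.
    for box in sorted(boxes, key=lambda b: (b.y0, b.x0)):
        last = merged[-1] if merged else None
        same_line = (
            last is not None
            and min(last.y1, box.y1) > max(last.y0, box.y0)
        )
        if same_line and last.label == box.label and box.x0 - last.x1 <= max_gap:
            # Replace the previous box with the union of both boxes.
            merged[-1] = Box(
                last.text + " " + box.text,
                last.label,
                min(last.x0, box.x0),
                min(last.y0, box.y0),
                max(last.x1, box.x1),
                max(last.y1, box.y1),
            )
        else:
            merged.append(box)
    return merged


# Illustrative input: two adjacent boxes with the same label, one unrelated box.
example_boxes = [
    Box("Acme", "vendor_name", 10, 10, 50, 20),
    Box("Corp", "vendor_name", 55, 10, 90, 20),
    Box("2023-05-25", "date_issue", 200, 10, 280, 20),
]
merged_boxes = merge_boxes(example_boxes)
```

Here the two `vendor_name` boxes are 5 units apart and are joined into a single "Acme Corp" box, while the `date_issue` box stays separate.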

Authors: Organizers

Affiliation: Rossum.ai, Czech Technical University in Prague, University of La Rochelle

Description: Baseline method. Uses multi-label NER formulation with RoBERTa base as the backbone. It is pre-trained on the synthetic part of the DocILE dataset.
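In a multi-label NER formulation, each token can carry more than one field label, so labels are typically scored independently rather than through a single softmax. The sketch below illustrates that decoding idea only; it is not the organizers' code, and `decode_multilabel`, the 0.5 threshold, and the label names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Standard logistic function.
    return 1.0 / (1.0 + math.exp(-z))


def decode_multilabel(logits, labels, threshold=0.5):
    """Return, for each token, every label whose independent sigmoid
    probability reaches the threshold (possibly none, possibly several)."""
    return [
        [label for label, z in zip(labels, row) if sigmoid(z) >= threshold]
        for row in logits
    ]


# Hypothetical label set and per-token logits (one row per token).
labels = ["field_a", "field_b", "field_c"]
logits = [
    [2.0, -1.0, 0.5],    # token 0: two labels pass the threshold
    [-3.0, -2.0, -1.0],  # token 1: no label passes
]
token_labels = decode_multilabel(logits, labels)
```

Token 0 receives both `field_a` and `field_c`, which a single-label (softmax) decoder could not express.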

Authors: Organizers

Affiliation: Rossum.ai, Czech Technical University in Prague, University of La Rochelle

Description: Baseline method. Uses multi-label NER formulation with LayoutLMv3 as the backbone. It is pre-trained on the unlabelled and synthetic parts of the DocILE dataset.

Ranking Table

Date | Method | F1 | AP | Precision | Recall
2023-05-25 | GraphDoc+Classify+Merge | 75.93% | 57.89% | 80.82% | 71.60%
2023-05-02 | baseline - RoBERTa-base with synthetic pre-training | 69.82% | 58.28% | 70.98% | 68.71%
2023-05-02 | baseline - LayoutLMv3 with unsupervised and synthetic pre-training | 69.06% | 58.23% | 70.95% | 67.27%
2023-05-02 | baseline - RoBERTa-base | 68.64% | 57.63% | 69.46% | 67.84%
2023-05-02 | baseline - LayoutLMv3 with unsupervised pre-training | 66.12% | 53.12% | 68.22% | 64.14%
2023-05-24 | YOLOv8X+Grid | 59.66% | 38.28% | 59.86% | 59.47%
2023-05-25 | SRCB Submission on Line Item Recognition | 41.32% | 17.44% | 43.27% | 39.53%
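The F1 column can be related to the Precision and Recall columns through the harmonic mean. The short sketch below assumes the standard F1 definition (the page itself does not state the formula) and checks it against the GraphDoc+Classify+Merge row; `f1_score` is an illustrative helper name.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of GraphDoc+Classify+Merge, in percent.
graphdoc_f1 = f1_score(80.82, 71.60)
print(round(graphdoc_f1, 2))  # 75.93
```

Because the published precision and recall are themselves rounded, recomputing F1 for other rows may differ from the table in the last digit.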

Ranking Graphic
