method: GraphDoc+Classify+Merge 2023-05-24

Authors: Yan Wang, Jiefeng Ma, Zhenrong Zhang, Pengfei Hu, Jianshu Zhang, Jun Du

Affiliation: University of Science and Technology of China (USTC), iFLYTEK AI Research

Description: We pre-trained several GraphDoc models on the provided unlabelled documents under different configurations, then fine-tuned them on the training set for 200-500 epochs. After classifying the OCR boxes into categories, we apply our proposed Merger module to aggregate them. We also apply pre- and post-processing based on the text content and the distances between OCR boxes. Finally, we use model ensembling to further improve system performance.
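The classify-then-merge idea can be illustrated with a small sketch. This is not the authors' Merger module, whose design is not described here; it is a minimal stand-in that assumes each OCR box already carries a predicted category and groups boxes that share a category and lie close together. The `Box` type, the `gap` distance and the `max_gap` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical OCR box: corner coordinates plus a predicted category."""
    x0: float
    y0: float
    x1: float
    y1: float
    label: str


def gap(a: Box, b: Box) -> float:
    """Euclidean gap between two axis-aligned boxes (0 if they touch or overlap)."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def merge_boxes(boxes, max_gap=10.0):
    """Group boxes with the same label that lie within max_gap of each other.

    Assumption: a simple distance threshold; the real Merger may be learned.
    """
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            same_label = boxes[i].label == boxes[j].label
            if same_label and gap(boxes[i], boxes[j]) <= max_gap:
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    # Each group becomes the bounding box of its members.
    merged = [
        Box(
            min(b.x0 for b in members),
            min(b.y0 for b in members),
            max(b.x1 for b in members),
            max(b.y1 for b in members),
            members[0].label,
        )
        for members in groups.values()
    ]
    return sorted(merged, key=lambda b: (b.y0, b.x0))


example = [
    Box(0, 0, 10, 10, "amount"),
    Box(15, 0, 25, 10, "amount"),    # 5 units from the first box: merged
    Box(100, 0, 110, 10, "amount"),  # too far away: kept separate
    Box(12, 0, 14, 10, "date"),      # different category: kept separate
]
merged_example = merge_boxes(example)
```

Union-find is used so that chains of nearby boxes collapse into a single group regardless of the order in which pairs are compared.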


Affiliation: UIT - University of Information Technology - VNUHCM


Description: Our approach builds on the baseline checkpoint with some improvements. We trained or used the following models:
1. RoBERTa base trained from scratch with FGM and the Lion optimizer on synthetic data for 30 epochs, then trained on the annotated data.
2. Our RoBERTa model (checkpoint), with the Lion optimizer.
3. RoBERTa base (checkpoint).

We then ensembled the models by taking the union of the words marked with one of the 55 field types, followed by post-processing.
Next, we used the ensembled model to predict labels for the unlabeled data, used the resulting pseudo-labeled data to pre-train the 3 models, and then trained them on the annotated data.
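The union-style ensemble can be sketched as follows. This is an illustration only, assuming each model's output is a mapping from a word index to the set of field types it assigned; the data layout, the function name `union_ensemble` and the field names in the example are hypothetical, and the post-processing step is not shown.

```python
def union_ensemble(model_predictions):
    """Combine per-word field predictions from several models by set union.

    model_predictions: list of dicts, one per model, mapping a word index
    to the set of field types that model assigned to the word.
    A word is kept if at least one model marked it with a field type.
    """
    merged = {}
    for predictions in model_predictions:
        for word_id, fields in predictions.items():
            merged.setdefault(word_id, set()).update(fields)
    # Drop words that no model assigned to any field.
    return {word_id: fields for word_id, fields in merged.items() if fields}


# Illustrative outputs of three models on the same document.
model_a = {0: {"amount_due"}, 1: set()}
model_b = {1: {"date_issue"}, 2: {"amount_due"}}
model_c = {0: {"currency_code"}}

ensembled = union_ensemble([model_a, model_b, model_c])
```

A union favours recall: a word missed by one model is still recovered if another model marks it, at the cost of also inheriting each model's false positives.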


Authors: Organizers

Affiliation:, Czech Technical University in Prague, University of La Rochelle

Description: Baseline method. Uses multi-label NER formulation with RoBERTa base as the backbone. It is pre-trained on the synthetic part of the DocILE dataset.
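A multi-label NER formulation lets a single token belong to several field types at once. A minimal sketch of the decoding step, assuming one independent sigmoid score per field type and a fixed threshold, is given below. The function names, the threshold of 0.5 and the label names are assumptions for illustration; the baseline's actual decoding is not described here.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, labels, threshold=0.5):
    """Return, for each token, every label whose sigmoid score passes the threshold.

    logits: one row per token, one raw score per label.
    Unlike softmax decoding, a token may receive zero, one or several labels.
    """
    return [
        [label for label, z in zip(labels, row) if sigmoid(z) >= threshold]
        for row in logits
    ]


labels = ["amount_due", "date_issue", "currency_code"]  # illustrative names
token_logits = [
    [2.0, -1.0, 0.5],    # two labels pass the threshold
    [-3.0, -3.0, -3.0],  # no label: token is outside every field
    [0.1, 4.0, -0.2],    # two labels pass the threshold
]
decoded = decode_multilabel(token_logits, labels)
```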

Ranking Table

Date        Method
2023-05-02  baseline - RoBERTa-base with synthetic pre-training                 33.43%  51.19%  50.78%  51.61%
2023-05-02  baseline - RoBERTa-base                                             33.06%  51.00%  50.52%  51.48%
2023-05-02  baseline - LayoutLMv3 with unsupervised and synthetic pre-training  31.68%  50.10%  50.66%  49.55%
2023-05-02  baseline - LayoutLMv3 with unsupervised pre-training                31.37%  49.03%  48.82%  49.25%
2023-05-25  SRCB Submission on Key Information Localization and Extraction       7.27%  26.86%  27.20%  26.53%
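The labels of the four score columns are not preserved in the table. One observation that can be checked arithmetically: in every row, the second score matches the harmonic mean of the third and fourth, which is what an F1 score computed from precision and recall would give. The sketch below runs that check. Reading the columns as F1, precision and recall is an inference from the numbers, not something the table states, and the first score is left uninterpreted.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (method, second score, third score, fourth score), copied from the table.
ROWS = [
    ("baseline - RoBERTa-base with synthetic pre-training", 51.19, 50.78, 51.61),
    ("baseline - RoBERTa-base", 51.00, 50.52, 51.48),
    ("baseline - LayoutLMv3 with unsupervised and synthetic pre-training", 50.10, 50.66, 49.55),
    ("baseline - LayoutLMv3 with unsupervised pre-training", 49.03, 48.82, 49.25),
    ("SRCB Submission on Key Information Localization and Extraction", 26.86, 27.20, 26.53),
]


def consistent(reported: float, p: float, r: float, tol: float = 0.01) -> bool:
    """True if the reported score agrees with F1(p, r) up to rounding."""
    return abs(f1(p, r) - reported) <= tol


for name, second, third, fourth in ROWS:
    print(f"{name}: computed {f1(third, fourth):.2f}, reported {second:.2f}")
```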

Ranking Graphic
