method: GraphDoc+Classify+Merge (2023-05-24)
Authors: Yan Wang, Jiefeng Ma, Zhenrong Zhang, Pengfei Hu, Jianshu Zhang, Jun Du
Affiliation: University of Science and Technology of China (USTC), iFLYTEK AI Research
Description: We pre-trained several GraphDoc models on the provided unlabelled documents under different configurations, then fine-tuned them on the training set for 200-500 epochs. After classifying OCR boxes into field-type categories, we proposed a Merger module to aggregate the classified boxes into fields.
We also applied pre- and post-processing based on the text content and the distances between OCR boxes. Finally, we adopted model ensembling to further enhance system performance.
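A minimal sketch of what such a distance-based merge step could look like, assuming greedy left-to-right grouping of same-category boxes on the same text line. The `OcrBox` structure, function names, and thresholds are illustrative assumptions, not the authors' actual Merger module:

```python
from dataclasses import dataclass

@dataclass
class OcrBox:
    text: str
    field_type: str   # category predicted by the classifier (hypothetical)
    bbox: tuple       # (x0, y0, x1, y1) in page coordinates

def horizontal_gap(a: OcrBox, b: OcrBox) -> float:
    """Distance between two boxes along the x-axis (0 if they overlap)."""
    return max(b.bbox[0] - a.bbox[2], a.bbox[0] - b.bbox[2], 0.0)

def same_line(a: OcrBox, b: OcrBox, tol: float = 5.0) -> bool:
    """Rough check that two boxes sit on the same text line."""
    return abs(a.bbox[1] - b.bbox[1]) <= tol

def merge_boxes(boxes: list[OcrBox], max_gap: float = 15.0) -> list[OcrBox]:
    """Greedily merge same-category neighbours in reading order."""
    boxes = sorted(boxes, key=lambda b: (b.bbox[1], b.bbox[0]))
    merged: list[OcrBox] = []
    for box in boxes:
        if (merged
                and merged[-1].field_type == box.field_type
                and same_line(merged[-1], box)
                and horizontal_gap(merged[-1], box) <= max_gap):
            prev = merged[-1]
            merged[-1] = OcrBox(
                text=prev.text + " " + box.text,
                field_type=prev.field_type,
                bbox=(min(prev.bbox[0], box.bbox[0]),
                      min(prev.bbox[1], box.bbox[1]),
                      max(prev.bbox[2], box.bbox[2]),
                      max(prev.bbox[3], box.bbox[3])),
            )
        else:
            merged.append(box)
    return merged
```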
method: Baseline+Ensemble+Pseudo+Post-Processing (2023-05-16)
Authors: UIT@AICLUB_TAB
Affiliation: UIT - University of Information Technology - VNUHCM
Email: 22520121@gm.uit.edu.vn
Description: Our approach builds on the baseline checkpoint with several improvements. We trained/used the following models:
1. RoBERTa-base trained from scratch with FGM and the Lion optimizer on synthetic data for 30 epochs, then fine-tuned on the annotated data.
2. Our RoBERTa checkpoint with the Lion optimizer.
3. The RoBERTa-base baseline checkpoint.
We then ensembled them by taking the per-word union of predictions: a word is kept if any model marks it with one of the 55 field types, followed by post-processing (a sketch of this union step follows the pipeline link below).
Finally, we ran the ensemble on the unlabelled data to obtain pseudo-labels, pre-trained the three models on this pseudo data, and fine-tuned them on the annotated data once more.
Pipeline: https://ibb.co/4MWcXgb
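A minimal sketch of the union ensemble described above, under the assumption that each model outputs, per word, the set of field-type ids it predicts; all names and data shapes are hypothetical:

```python
from collections import defaultdict

NUM_FIELD_TYPES = 55  # field types in the DocILE KILE task

def union_ensemble(predictions: list[dict[int, set[int]]]) -> dict[int, set[int]]:
    """predictions: one dict per model mapping word_id -> set of field-type ids.
    Returns the per-word union of field types across all models."""
    combined: dict[int, set[int]] = defaultdict(set)
    for model_preds in predictions:
        for word_id, field_types in model_preds.items():
            combined[word_id] |= field_types
    return dict(combined)

# Example: three models; word 7 is flagged by only two of them but is kept.
m1 = {7: {3}, 12: {0, 4}}
m2 = {12: {0}}
m3 = {7: {3}, 9: {41}}
print(union_ensemble([m1, m2, m3]))  # {7: {3}, 12: {0, 4}, 9: {41}}
```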
method: baseline - RoBERTa-base with synthetic pre-training (2023-05-02)
Authors: Organizers
Affiliation: Rossum.ai, Czech Technical University in Prague, University of La Rochelle
Description: Baseline method. It uses a multi-label NER formulation with RoBERTa-base as the backbone and is pre-trained on the synthetic part of the DocILE dataset.
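As a rough illustration of the multi-label NER formulation, the sketch below puts a per-token sigmoid head over RoBERTa-base (one logit per field type, binary cross-entropy loss). This is a minimal sketch of the general formulation, not the organizers' actual baseline code:

```python
import torch
import torch.nn as nn
from transformers import RobertaModel

class MultiLabelNer(nn.Module):
    def __init__(self, num_field_types: int = 55):
        super().__init__()
        self.backbone = RobertaModel.from_pretrained("roberta-base")
        self.head = nn.Linear(self.backbone.config.hidden_size, num_field_types)

    def forward(self, input_ids, attention_mask, labels=None):
        # Per-token contextual embeddings from the RoBERTa encoder.
        hidden = self.backbone(input_ids, attention_mask=attention_mask).last_hidden_state
        logits = self.head(hidden)  # (batch, seq_len, num_field_types)
        if labels is not None:
            # Multi-label: an independent sigmoid per field type, so one token
            # can belong to several field types at once.
            loss = nn.functional.binary_cross_entropy_with_logits(logits, labels.float())
            return loss, logits
        return logits
```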
Date | Method | AP | F1 | Precision | Recall
---|---|---|---|---|---
2023-05-24 | GraphDoc+Classify+Merge | 37.20% | 53.14% | 51.11% | 55.34%
2023-05-16 | Baseline+Ensemble+Pseudo+Post-Processing | 36.10% | 47.14% | 44.18% | 50.52%
2023-05-02 | baseline - RoBERTa-base with synthetic pre-training | 33.43% | 51.19% | 50.78% | 51.61%
2023-05-02 | baseline - RoBERTa-base | 33.06% | 51.00% | 50.52% | 51.48%
2023-05-02 | baseline - LayoutLMv3 with unsupervised and synthetic pre-training | 31.68% | 50.10% | 50.66% | 49.55%
2023-05-02 | baseline - LayoutLMv3 with unsupervised pre-training | 31.37% | 49.03% | 48.82% | 49.25%
2023-05-25 | SRCB Submission on Key Information Localization and Extraction | 7.27% | 26.86% | 27.20% | 26.53%
2023-05-08 | YOLOv8X+Grid | 0.00% | 0.00% | 0.00% | 0.00%