- Task 2 - E2E Complex Entity Labeling - Method: multi-modal based KIE using LayoutLMv3
Method: multi-modal based KIE using LayoutLMv3 (2023-03-17)
Authors: Jie Li, Wei Wang, Min Xu, Yiru Zhao, Bin Zhang, Pengyu Chen, Danya Zhou, Yuqi Zhang, Ruixue Zhang, Di Wang, Hui Wang, Chao Li, Shiyu Hu, Dong Xiang, Songtao Li, Yunxin Yang
Affiliation: SPDB LAB
Email: 18206291823@163.com
Description: Our approach to document information extraction is based on fine-tuning LayoutLMv3, a pre-trained model for document analysis and recognition. We took the general-purpose LayoutLMv3 model as the foundation and fine-tuned it on the competition data. To address the long-tailed, imbalanced category distribution of the Task 2 competition data, we synthesized additional data for the minority categories. In post-processing, we sorted the text of each category according to its reading order. Our method achieved an F1 score of approximately 0.85 on the validation set, which indicates that it extracts information effectively from a variety of document types.
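The description does not say how the reading order is computed in post-processing. The following is a minimal sketch under stated assumptions: each predicted token carries a category label and a bounding box, tokens labeled "O" are non-entities, and reading order is approximated as top-to-bottom, then left-to-right within a line. The `Token` type, the `group_by_reading_order` function, and the `line_tol` parameter are hypothetical names introduced only for illustration; they are not taken from the submission.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    """Hypothetical container for one predicted text box."""
    text: str
    label: str  # predicted entity category; "O" marks a non-entity
    x: float    # left edge of the bounding box
    y: float    # top edge of the bounding box
    h: float    # height of the bounding box


def group_by_reading_order(tokens: List[Token], line_tol: float = 0.5) -> Dict[str, str]:
    """Join the text of each category in an approximate reading order.

    Assumption: two boxes belong to the same text line when their top edges
    differ by at most `line_tol` times the box height.
    """
    by_label: Dict[str, List[Token]] = defaultdict(list)
    for tok in tokens:
        if tok.label != "O":
            by_label[tok.label].append(tok)

    result: Dict[str, str] = {}
    for label, items in by_label.items():
        # Sort top-to-bottom first so that lines can be built greedily.
        items.sort(key=lambda t: (t.y, t.x))
        lines: List[List[Token]] = []
        for tok in items:
            # Compare against the first box of the current line.
            if lines and abs(tok.y - lines[-1][0].y) <= line_tol * tok.h:
                lines[-1].append(tok)
            else:
                lines.append([tok])
        # Within each line, order boxes left-to-right.
        ordered = [t for line in lines for t in sorted(line, key=lambda t: t.x)]
        result[label] = " ".join(t.text for t in ordered)
    return result


if __name__ == "__main__":
    demo = [
        Token("Road", "address", 60, 11, 10),
        Token("12", "address", 10, 12, 10),
        Token("Main", "address", 30, 10, 10),
        Token("Shanghai", "address", 10, 30, 10),
        Token("ACME", "company", 10, 0, 10),
        Token("Total", "O", 10, 50, 10),
    ]
    print(group_by_reading_order(demo))
```

The line tolerance matters because OCR boxes on the same visual line rarely share an identical top coordinate; sorting strictly by `(y, x)` would interleave words from slightly skewed lines. Multi-column or rotated layouts would need a more robust ordering than this sketch provides.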
@misc{2204.08387,
  Author = {Yupan Huang and Tengchao Lv and Lei Cui and Yutong Lu and Furu Wei},
  Title = {LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking},
  Year = {2022},
  Eprint = {arXiv:2204.08387},
}