- Task 1 - Single Page Document VQA
- Task 2 - Document Collection VQA
- Task 3 - Infographics VQA
- Task 4 - MP-DocVQA
method: Human Performance
Date: 2020-06-13
Authors: DocVQA Organizers
Affiliation: CVIT, IIIT Hyderabad, CVC-UAB, Amazon
Description: Human performance on the test set.
A small group of volunteers was asked to enter an answer for each question, given the corresponding document image.
method: Seed-VL-1.5
Date: 2025-05-13
Authors: Seed-VL
Affiliation: ByteDance
Description: Seed-VL-1.5
Reference: Dong Guo et al., "Seed1.5-VL Technical Report," arXiv:2505.07062, 2025. https://arxiv.org/abs/2505.07062
method: qwen2-vl
Date: 2024-07-11
Authors: Qwen Team
Affiliation: Alibaba Group
Description: Qwen2-VL
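All scores in the leaderboard table below are ANLS (Average Normalized Levenshtein Similarity), the DocVQA challenge metric: each prediction is compared against every accepted ground-truth answer for its question, similarities below a 0.5 threshold are zeroed out, and the best match per question is averaged over the test set. Below is a minimal Python sketch of that computation; it is an illustrative reimplementation, not the official evaluation script.

```python
# Minimal sketch of ANLS (Average Normalized Levenshtein Similarity),
# the DocVQA challenge metric. Illustrative reimplementation only,
# not the official evaluation script.

def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            # Three edit operations: delete, insert, substitute.
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def anls(predictions, gold_answers, tau=0.5):
    """predictions: list[str]. gold_answers: list[list[str]], since a
    question may accept several ground-truth answers. Matching is
    case-insensitive; similarities below tau are zeroed out."""
    total = 0.0
    for pred, answers in zip(predictions, gold_answers):
        best = 0.0
        for ans in answers:
            p, a = pred.strip().lower(), ans.strip().lower()
            nl = levenshtein(p, a) / max(len(p), len(a), 1)
            best = max(best, 1.0 - nl if nl < tau else 0.0)
        total += best
    return total / max(len(predictions), 1)

print(anls(["1988", "unknown"], [["1988"], ["Lee"]]))  # 0.5
```

The 0.5 threshold keeps the metric tolerant of small OCR-style typos in otherwise correct answers while scoring unrelated answers as zero.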
Date | Method | Score | Figure/Diagram | Form | Table/List | Layout | Free text | Image/Photo | Handwritten | Yes/No | Others
---|---|---|---|---|---|---|---|---|---|---|---
2020-06-13 | Human Performance | 0.9811 | 0.9756 | 0.9825 | 0.9780 | 0.9845 | 0.9839 | 0.9740 | 0.9717 | 0.9974 | 0.9828
2025-05-13 | Seed-VL-1.5 | 0.9691 | 0.9447 | 0.9815 | 0.9764 | 0.9674 | 0.9582 | 0.9162 | 0.9522 | 1.0000 | 0.9464
2024-07-11 | qwen2-vl | 0.9670 | 0.9206 | 0.9816 | 0.9703 | 0.9678 | 0.9619 | 0.9135 | 0.9436 | 0.9655 | 0.9540
2024-06-30 | InternVL2-Pro (generalist) | 0.9506 | 0.8888 | 0.9714 | 0.9486 | 0.9582 | 0.9446 | 0.8909 | 0.9278 | 0.9655 | 0.9410
2025-01-16 | VideoLLaMA3-7B | 0.9494 | 0.8843 | 0.9692 | 0.9498 | 0.9532 | 0.9427 | 0.8838 | 0.9293 | 0.9310 | 0.9313
2025-04-03 | test | 0.9406 | 0.8864 | 0.9603 | 0.9432 | 0.9417 | 0.9189 | 0.8639 | 0.9153 | 0.8966 | 0.9231
2024-09-25 | Molmo-72B | 0.9351 | 0.8822 | 0.9548 | 0.9387 | 0.9413 | 0.9100 | 0.8688 | 0.9196 | 0.9195 | 0.9229
2025-02-26 | Qwen2.5-3B-lite | 0.9342 | 0.8807 | 0.9622 | 0.9316 | 0.9397 | 0.9128 | 0.9022 | 0.9234 | 0.7931 | 0.8887
2024-12-13 | DeepSeek-VL2 | 0.9330 | 0.8853 | 0.9575 | 0.9364 | 0.9309 | 0.9214 | 0.8685 | 0.8988 | 0.8966 | 0.9008
2024-01-24 | qwenvl-max (single generalist model) | 0.9307 | 0.8491 | 0.9474 | 0.9195 | 0.9403 | 0.9380 | 0.8652 | 0.8922 | 0.8621 | 0.9341
2024-05-10 | Vary (using multi crop) | 0.9241 | 0.8926 | 0.9372 | 0.8953 | 0.9405 | 0.9447 | 0.9035 | 0.9335 | 0.8739 | 0.9478
2024-04-27 | InternVL-1.5-Plus (generalist) | 0.9234 | 0.8354 | 0.9556 | 0.9123 | 0.9397 | 0.9032 | 0.8313 | 0.9064 | 0.9655 | 0.9098
2024-11-01 | MLCD-Embodied-7B: Multi-label Cluster Discrimination for Visual Representation Learning | 0.9158 | 0.8286 | 0.9315 | 0.9131 | 0.9289 | 0.9088 | 0.7804 | 0.8300 | 0.8897 | 0.8796
2023-12-07 | qwenvl-plus (single generalist model) | 0.9141 | 0.8146 | 0.9464 | 0.8999 | 0.9277 | 0.9265 | 0.8419 | 0.8776 | 0.9310 | 0.8667
2023-11-15 | SMoLA-PaLI-X Specialist Model | 0.9084 | 0.7790 | 0.9416 | 0.8934 | 0.9262 | 0.9188 | 0.7911 | 0.8508 | 0.8966 | 0.8456
2025-01-08 | PP-DocBee-2B | 0.9056 | 0.7998 | 0.9541 | 0.8910 | 0.9211 | 0.8800 | 0.8901 | 0.8911 | 0.7561 | 0.8893
2023-12-07 | SMoLA-PaLI-X Generalist Model | 0.9055 | 0.7757 | 0.9381 | 0.8924 | 0.9187 | 0.9179 | 0.8364 | 0.8483 | 0.7446 | 0.8609
2024-05-01 | Snowflake Arctic-TILT 0.8B (fine-tuned) | 0.9020 | 0.7198 | 0.9398 | 0.9152 | 0.9015 | 0.9042 | 0.6860 | 0.8415 | 0.6897 | 0.8604
2022-10-08 | BAIDU-DI | 0.9016 | 0.6823 | 0.9186 | 0.9139 | 0.9138 | 0.9234 | 0.6841 | 0.7949 | 0.6181 | 0.8344
2024-04-02 | InternLM-XComposer2-4KHD-7B | 0.9002 | 0.8041 | 0.9400 | 0.8965 | 0.9143 | 0.8618 | 0.7845 | 0.8264 | 0.8621 | 0.8298
2024-02-10 | ScreenAI 5B | 0.8988 | 0.7297 | 0.9419 | 0.8928 | 0.9158 | 0.8873 | 0.7722 | 0.8160 | 0.8966 | 0.8551
2024-05-01 | Snowflake Arctic-TILT 0.8B (zero-shot) | 0.8881 | 0.6826 | 0.9311 | 0.9011 | 0.8867 | 0.8917 | 0.6534 | 0.8219 | 0.6897 | 0.8515
2022-03-31 | Tencent Youtu | 0.8866 | 0.7576 | 0.9470 | 0.8932 | 0.8821 | 0.8654 | 0.6680 | 0.8877 | 0.4828 | 0.8413
2022-01-13 | ERNIE-Layout 2.0 | 0.8841 | 0.6434 | 0.9177 | 0.8996 | 0.8899 | 0.9010 | 0.6223 | 0.7836 | 0.6124 | 0.8118
2023-12-10 | DocFormerv2 (Single Model with 750M Parameters) | 0.8784 | 0.6680 | 0.9382 | 0.9076 | 0.8676 | 0.8555 | 0.5840 | 0.8123 | 0.8276 | 0.8070
2024-10-30 | BlueLM-V-3B | 0.8775 | 0.7652 | 0.9245 | 0.8659 | 0.9005 | 0.8372 | 0.8079 | 0.8276 | 0.7931 | 0.7734
2024-09-08 | neetolab-sota-v1 | 0.8759 | 0.7938 | 0.9209 | 0.8577 | 0.8946 | 0.8558 | 0.8011 | 0.8664 | 0.6207 | 0.8261
2021-11-26 | Mybank-DocReader | 0.8755 | 0.6682 | 0.9233 | 0.8763 | 0.8896 | 0.8713 | 0.6290 | 0.8047 | 0.5805 | 0.7804
2021-09-06 | ERNIE-Layout 1.0 | 0.8753 | 0.6586 | 0.8972 | 0.8864 | 0.8902 | 0.8943 | 0.6392 | 0.7331 | 0.5434 | 0.8115
2025-05-08 | PeKi_DocVQA | 0.8742 | 0.7651 | 0.9372 | 0.8513 | 0.8940 | 0.8356 | 0.7789 | 0.8507 | 0.8276 | 0.8147
2024-08-22 | Mini-Monkey | 0.8738 | 0.7334 | 0.9350 | 0.8493 | 0.9046 | 0.8383 | 0.7931 | 0.8262 | 0.6782 | 0.7628
2024-05-31 | GPT-4 Vision Turbo + Amazon Textract OCR | 0.8736 | 0.7346 | 0.9196 | 0.8756 | 0.8678 | 0.8709 | 0.8137 | 0.8681 | 0.8966 | 0.8464
2021-02-12 | Applica.ai TILT | 0.8705 | 0.6082 | 0.9459 | 0.8980 | 0.8592 | 0.8581 | 0.5508 | 0.8139 | 0.6897 | 0.7788
2023-05-31 | PaLI-X (Google Research; Single Generative Model) | 0.8679 | 0.6971 | 0.8992 | 0.8400 | 0.8955 | 0.8925 | 0.7589 | 0.7209 | 0.8966 | 0.8468
2020-12-22 | LayoutLM 2.0 (single model) | 0.8672 | 0.6574 | 0.8953 | 0.8769 | 0.8791 | 0.8707 | 0.7287 | 0.6729 | 0.5517 | 0.8103
2024-01-24 | nnrc_vary | 0.8631 | 0.6689 | 0.9174 | 0.8354 | 0.8876 | 0.8761 | 0.6891 | 0.8269 | 0.6207 | 0.7696
2023-12-10 | 54_nnrc_zephyr | 0.8560 | 0.6170 | 0.8924 | 0.8603 | 0.8546 | 0.9020 | 0.6083 | 0.8142 | 0.7488 | 0.8386
2020-08-16 | Alibaba DAMO NLP | 0.8506 | 0.6650 | 0.8809 | 0.8552 | 0.8733 | 0.8397 | 0.6758 | 0.7691 | 0.5492 | 0.7526
2020-05-16 | PingAn-OneConnect-Gammalab-DQA | 0.8484 | 0.6059 | 0.9021 | 0.8463 | 0.8730 | 0.8337 | 0.5812 | 0.7692 | 0.5172 | 0.7289
2024-05-01 | PaliGemma-3B (finetune, 896px) | 0.8477 | 0.6543 | 0.9252 | 0.8326 | 0.8733 | 0.8099 | 0.7382 | 0.8314 | 0.7931 | 0.7571
2024-01-21 | Spatial LLM v1.2 | 0.8443 | 0.6300 | 0.8917 | 0.8180 | 0.8644 | 0.8877 | 0.6106 | 0.7390 | 0.6897 | 0.8097
2023-02-21 | LayoutLMv2_star_seg_large | 0.8430 | 0.7008 | 0.8737 | 0.8389 | 0.8536 | 0.8498 | 0.6872 | 0.7823 | 0.6181 | 0.8252
2025-04-30 | Vlm(qwen) | 0.8411 | 0.7205 | 0.9398 | 0.8492 | 0.8425 | 0.7352 | 0.7833 | 0.8945 | 0.8276 | 0.8009
2024-06-26 | MoVA-8B (generalist) | 0.8341 | 0.7639 | 0.8494 | 0.8131 | 0.8752 | 0.8187 | 0.6503 | 0.7048 | 0.5172 | 0.7901
2023-06-30 | LATIN-Prompt + Claude (Zero shot) | 0.8336 | 0.6601 | 0.8553 | 0.8584 | 0.8169 | 0.8726 | 0.6021 | 0.6774 | 0.7126 | 0.8258
2024-10-09 | llama3-qwenvit | 0.8318 | 0.7377 | 0.8928 | 0.7806 | 0.8822 | 0.8009 | 0.7732 | 0.7934 | 0.5862 | 0.7531
2024-09-13 | gemma+ocr | 0.8282 | 0.6326 | 0.8395 | 0.8298 | 0.8346 | 0.8778 | 0.6417 | 0.6176 | 0.6410 | 0.8113
2023-11-27 | 36_nnrc_llama2 | 0.8239 | 0.5404 | 0.8787 | 0.7958 | 0.8475 | 0.8813 | 0.5995 | 0.7991 | 0.6897 | 0.7922
2024-01-11 | nnrc_udop_224_6ds | 0.8227 | 0.5909 | 0.8706 | 0.8352 | 0.8335 | 0.8086 | 0.5972 | 0.6835 | 0.5862 | 0.7472
2024-08-02 | loixc-onestage | 0.8221 | 0.6215 | 0.8463 | 0.8020 | 0.8593 | 0.8407 | 0.5835 | 0.6453 | 0.7241 | 0.7352
2024-07-26 | loixc-vqa | 0.8127 | 0.6182 | 0.8188 | 0.7878 | 0.8560 | 0.8496 | 0.5840 | 0.5984 | 0.3993 | 0.7445
2025-04-30 | Vis(qwen) | 0.8093 | 0.7030 | 0.9266 | 0.8286 | 0.7925 | 0.6996 | 0.8201 | 0.8781 | 0.7931 | 0.7816
2023-05-06 | Docugami-Layout | 0.8031 | 0.5176 | 0.8875 | 0.7902 | 0.8214 | 0.8026 | 0.5089 | 0.7753 | 0.4224 | 0.7022
2024-03-01 | Vary | 0.7916 | 0.7415 | 0.7949 | 0.7378 | 0.8475 | 0.8101 | 0.6671 | 0.6552 | 0.7471 | 0.7888
2025-01-10 | llama | 0.7902 | 0.6923 | 0.8167 | 0.7716 | 0.8230 | 0.7710 | 0.6595 | 0.7343 | 0.5402 | 0.7287
2022-01-07 | LayoutLMV2-large on Textract | 0.7873 | 0.4924 | 0.8771 | 0.8218 | 0.7726 | 0.7661 | 0.4820 | 0.7276 | 0.3793 | 0.6983
2023-01-29 | LayoutLMv2_star_seg | 0.7859 | 0.5328 | 0.8406 | 0.7859 | 0.8128 | 0.7909 | 0.4879 | 0.6468 | 0.3644 | 0.6953
2024-05-21 | PaliGemma-3B (finetune, 448px) | 0.7802 | 0.6290 | 0.8553 | 0.7235 | 0.8336 | 0.7410 | 0.6787 | 0.7694 | 0.8276 | 0.7123
2023-05-25 | YoBerDaV2 Single-page | 0.7749 | 0.4737 | 0.8894 | 0.7586 | 0.7962 | 0.7398 | 0.4763 | 0.7173 | 0.7586 | 0.6976
2020-05-14 | Structural LM-v2 | 0.7674 | 0.4931 | 0.8381 | 0.7621 | 0.7924 | 0.7596 | 0.4756 | 0.6282 | 0.5517 | 0.6549
2024-10-09 | llama3-intern6b | 0.7670 | 0.6419 | 0.8360 | 0.7036 | 0.8455 | 0.7184 | 0.6323 | 0.7203 | 0.5862 | 0.7121
2022-09-18 | pix2struct-large | 0.7656 | 0.4424 | 0.8827 | 0.7702 | 0.7774 | 0.7085 | 0.5383 | 0.6320 | 0.7586 | 0.6536
2022-12-28 | Submission_ErnieLayout_base_finetuned_on_DocVQA_en_train_dev_textract_word_segments_ck-14000 | 0.7599 | 0.4313 | 0.8678 | 0.7726 | 0.7641 | 0.7330 | 0.4598 | 0.6957 | 0.4828 | 0.6097
2024-09-18 | Gemma 2b + OCR | 0.7517 | 0.4797 | 0.8067 | 0.7147 | 0.7771 | 0.8311 | 0.4922 | 0.5978 | 0.4282 | 0.6948
2024-04-22 | DOLMA_multifinetuning | 0.7458 | 0.4964 | 0.8335 | 0.7234 | 0.7832 | 0.7044 | 0.4135 | 0.5815 | 0.5172 | 0.6593
2024-02-13 | instructblip | 0.7429 | 0.5158 | 0.7918 | 0.7019 | 0.7751 | 0.8088 | 0.5765 | 0.5892 | 0.5172 | 0.7062
2025-01-22 | Ivy-VL | 0.7417 | 0.5853 | 0.7874 | 0.6919 | 0.7856 | 0.7674 | 0.5753 | 0.6341 | 0.5243 | 0.7174
2025-01-22 | Ivy-VL-01 | 0.7417 | 0.5853 | 0.7874 | 0.6919 | 0.7856 | 0.7674 | 0.5753 | 0.6341 | 0.5243 | 0.7174
2020-05-15 | QA_Base_MRC_2 | 0.7415 | 0.4854 | 0.8015 | 0.6738 | 0.7943 | 0.8136 | 0.5740 | 0.5831 | 0.5287 | 0.7161
2024-07-31 | tixc-vqa | 0.7413 | 0.5732 | 0.7581 | 0.6967 | 0.7965 | 0.7738 | 0.4705 | 0.5396 | 0.5862 | 0.6927
2020-05-15 | QA_Base_MRC_1 | 0.7407 | 0.4890 | 0.7984 | 0.6675 | 0.7936 | 0.8131 | 0.5854 | 0.6099 | 0.4943 | 0.7384
2020-05-15 | QA_Base_MRC_4 | 0.7348 | 0.4735 | 0.8040 | 0.6647 | 0.7838 | 0.8043 | 0.5618 | 0.5810 | 0.4598 | 0.7332
2020-05-15 | QA_Base_MRC_3 | 0.7322 | 0.4852 | 0.7958 | 0.6562 | 0.7842 | 0.8044 | 0.5679 | 0.5730 | 0.4511 | 0.7171
2024-10-26 | 0713ap +gpt4o(no v) | 0.7309 | 0.5116 | 0.8018 | 0.7379 | 0.7305 | 0.7303 | 0.4696 | 0.6240 | 0.4598 | 0.7309
2024-01-22 | VisFocus-Base | 0.7285 | 0.3822 | 0.8695 | 0.7234 | 0.7508 | 0.6717 | 0.3656 | 0.6748 | 0.6897 | 0.5507
2020-05-15 | QA_Base_MRC_5 | 0.7274 | 0.4858 | 0.7877 | 0.6550 | 0.7754 | 0.8047 | 0.5405 | 0.5619 | 0.4598 | 0.7084
2024-05-22 | Dolma multifinetuning 7 | 0.7219 | 0.4532 | 0.8259 | 0.7036 | 0.7585 | 0.6677 | 0.4227 | 0.5740 | 0.5862 | 0.6452
2022-09-18 | pix2struct-base | 0.7213 | 0.4111 | 0.8386 | 0.7253 | 0.7503 | 0.6407 | 0.4211 | 0.5753 | 0.6552 | 0.5822
2024-10-26 | 1010ap +gpt4o(no v) | 0.7201 | 0.4800 | 0.7680 | 0.7335 | 0.7258 | 0.7333 | 0.4797 | 0.5767 | 0.6552 | 0.7111
2024-04-02 | MiniCPM-V-2 | 0.7187 | 0.6012 | 0.8062 | 0.6312 | 0.7880 | 0.6753 | 0.6834 | 0.6789 | 0.7586 | 0.6464
2023-01-27 | LayoutLM-base+GNN | 0.6984 | 0.4747 | 0.7973 | 0.6848 | 0.7322 | 0.6323 | 0.4398 | 0.5599 | 0.5431 | 0.5388
2021-12-05 | Electra Large Squad | 0.6961 | 0.4485 | 0.7703 | 0.6348 | 0.7364 | 0.7644 | 0.4594 | 0.5438 | 0.5172 | 0.6470
2023-05-25 | YoBerDaV1 Multi-page | 0.6904 | 0.3481 | 0.8335 | 0.6411 | 0.7253 | 0.6854 | 0.4191 | 0.6299 | 0.5517 | 0.6129
2020-05-16 | HyperDQA_V4 | 0.6893 | 0.3874 | 0.7792 | 0.6309 | 0.7478 | 0.7187 | 0.4867 | 0.5630 | 0.4138 | 0.5685
2020-05-16 | HyperDQA_V3 | 0.6769 | 0.3876 | 0.7774 | 0.6167 | 0.7332 | 0.6961 | 0.4296 | 0.5373 | 0.4138 | 0.5650
2023-07-06 | GPT3.5 | 0.6759 | 0.4741 | 0.7144 | 0.6524 | 0.7036 | 0.6858 | 0.5385 | 0.5038 | 0.5954 | 0.6660
2020-05-16 | HyperDQA_V2 | 0.6734 | 0.3818 | 0.7666 | 0.6110 | 0.7332 | 0.6867 | 0.4834 | 0.5560 | 0.3793 | 0.5902
2020-05-09 | HyperDQA_V1 | 0.6717 | 0.4013 | 0.7693 | 0.6197 | 0.7167 | 0.6922 | 0.3598 | 0.5596 | 0.4138 | 0.5504
2023-08-15 | LATIN-Tuning-Prompt + Alpaca (Zero-shot) | 0.6687 | 0.3732 | 0.7529 | 0.6545 | 0.6615 | 0.7463 | 0.5439 | 0.4941 | 0.3481 | 0.6831
2023-07-14 | donut_base | 0.6590 | 0.3960 | 0.8407 | 0.6604 | 0.6987 | 0.4630 | 0.2969 | 0.6964 | 0.0345 | 0.5057
2023-12-04 | ViTLP | 0.6588 | 0.3880 | 0.8220 | 0.6705 | 0.6962 | 0.4670 | 0.2973 | 0.6307 | 0.4483 | 0.4910
2023-12-21 | DocVQA: A Dataset for VQA on Document Images | 0.6566 | 0.3569 | 0.7645 | 0.5775 | 0.7000 | 0.7205 | 0.4220 | 0.4802 | 0.4483 | 0.6108
2022-09-22 | BROS_BASE (WebViCoB 6.4M) | 0.6563 | 0.3780 | 0.7757 | 0.6681 | 0.6557 | 0.6175 | 0.3497 | 0.5782 | 0.4224 | 0.5754
2023-09-24 | Layoutlm_DocVQA+Token_v2 | 0.6562 | 0.3935 | 0.7764 | 0.6228 | 0.6737 | 0.6711 | 0.3385 | 0.5109 | 0.5086 | 0.5515
2023-07-21 | donut_half_input_imageSize | 0.6536 | 0.3930 | 0.8366 | 0.6548 | 0.6950 | 0.4609 | 0.2486 | 0.6940 | 0.0345 | 0.4941
2021-12-04 | Bert Large | 0.6447 | 0.3502 | 0.7535 | 0.5488 | 0.6920 | 0.7266 | 0.4171 | 0.5254 | 0.5517 | 0.6076
2022-05-23 | Dessurt | 0.6322 | 0.3164 | 0.8058 | 0.6486 | 0.6520 | 0.4852 | 0.2862 | 0.5830 | 0.3793 | 0.4365
2024-01-09 | dolma | 0.6196 | 0.4003 | 0.7642 | 0.5805 | 0.6609 | 0.5247 | 0.3958 | 0.5596 | 0.5690 | 0.4972
2025-04-30 | Vlm(llama) | 0.5914 | 0.4167 | 0.7576 | 0.5024 | 0.6569 | 0.4938 | 0.3839 | 0.6258 | 0.5632 | 0.5198
2020-05-09 | bert fulldata fintuned | 0.5900 | 0.4169 | 0.6870 | 0.4269 | 0.6710 | 0.7315 | 0.5124 | 0.4900 | 0.4483 | 0.5907
2020-05-01 | bert finetuned | 0.5872 | 0.2986 | 0.7011 | 0.4849 | 0.6359 | 0.6933 | 0.4622 | 0.4751 | 0.4483 | 0.4895
2020-04-30 | HyperDQA_V0 | 0.5715 | 0.3131 | 0.6780 | 0.4732 | 0.6630 | 0.5716 | 0.3623 | 0.4351 | 0.3793 | 0.4941
2023-09-26 | LayoutLM_Docvqa+Token_v0 | 0.4980 | 0.2319 | 0.6035 | 0.4320 | 0.5684 | 0.4779 | 0.2768 | 0.3081 | 0.1293 | 0.4178
2022-04-27 | LayoutLMv2, Tesseract OCR eval (dataset OCR trained) | 0.4961 | 0.2544 | 0.5523 | 0.4177 | 0.5495 | 0.5914 | 0.2888 | 0.1361 | 0.2069 | 0.4187
2025-04-30 | Vis(llama) | 0.4919 | 0.2820 | 0.6123 | 0.4556 | 0.5167 | 0.4446 | 0.2262 | 0.5262 | 0.5287 | 0.4508
2022-03-29 | LayoutLMv2, Tesseract OCR eval (Tesseract OCR trained) | 0.4815 | 0.2253 | 0.5440 | 0.4216 | 0.5207 | 0.5709 | 0.2430 | 0.1353 | 0.3103 | 0.3859
2023-07-26 | donut_large_encoderSize_finetuned_20_epoch | 0.4673 | 0.2236 | 0.6691 | 0.4581 | 0.5026 | 0.2665 | 0.1356 | 0.4983 | 0.5734 | 0.3430
2020-04-27 | bert | 0.4557 | 0.2233 | 0.5259 | 0.2633 | 0.5113 | 0.7775 | 0.4859 | 0.3565 | 0.0345 | 0.5778
2020-05-16 | UGLIFT v0.1 (Clova OCR) | 0.4417 | 0.1766 | 0.5600 | 0.3178 | 0.5340 | 0.4520 | 0.2253 | 0.3573 | 0.4483 | 0.3356
2024-05-21 | PaliGemma-3B (finetune, 224px) | 0.4374 | 0.4025 | 0.4516 | 0.3236 | 0.5574 | 0.4055 | 0.3500 | 0.4077 | 0.6379 | 0.4066
2025-04-30 | HocrEN(Technique 2) - qwen7b | 0.4282 | 0.1242 | 0.3501 | 0.4594 | 0.4755 | 0.4479 | 0.2327 | 0.0400 | 0.2414 | 0.3793
2025-04-30 | HocrEN(Technique 2) - qwen14b | 0.3794 | 0.0859 | 0.3120 | 0.4120 | 0.4467 | 0.3449 | 0.1065 | 0.0450 | 0.1724 | 0.3581
2022-10-21 | Finetuning LayoutLMv3_Base | 0.3596 | 0.2102 | 0.4498 | 0.3858 | 0.3262 | 0.3496 | 0.1552 | 0.3404 | 0.0345 | 0.2706
2023-09-19 | testtest | 0.3569 | 0.3018 | 0.3407 | 0.2748 | 0.4693 | 0.3186 | 0.2682 | 0.2753 | 0.6207 | 0.3356
2020-05-14 | Plain BERT QA | 0.3524 | 0.1687 | 0.4489 | 0.2029 | 0.4321 | 0.4812 | 0.3517 | 0.3096 | 0.0345 | 0.3747
2020-05-16 | Clova OCR V0 | 0.3489 | 0.0977 | 0.4855 | 0.2670 | 0.3811 | 0.3958 | 0.2489 | 0.2875 | 0.0345 | 0.3062
2020-05-01 | HDNet | 0.3401 | 0.2040 | 0.4688 | 0.2181 | 0.4710 | 0.1916 | 0.2488 | 0.2736 | 0.1379 | 0.2458
2020-05-16 | CLOVA OCR | 0.3296 | 0.1246 | 0.4612 | 0.2455 | 0.3622 | 0.3746 | 0.1692 | 0.2736 | 0.0690 | 0.3205
2023-07-21 | donut_small_encoderSize_finetuned_20_epoch | 0.3157 | 0.1935 | 0.4417 | 0.2912 | 0.3400 | 0.2075 | 0.1495 | 0.2658 | 0.3103 | 0.2644
2020-04-29 | docVQAQV_V0.1 | 0.3016 | 0.2010 | 0.3898 | 0.3810 | 0.2933 | 0.0664 | 0.1842 | 0.2736 | 0.1586 | 0.1695
2025-04-30 | HocrEN(Technique 2) - qwen32b | 0.2931 | 0.0599 | 0.2225 | 0.2960 | 0.3636 | 0.2765 | 0.0605 | 0.0264 | 0.1379 | 0.2948
2025-05-10 | m-rope2 | 0.2676 | 0.2480 | 0.2885 | 0.2493 | 0.2836 | 0.2676 | 0.2282 | 0.2541 | 0.3793 | 0.2452
2025-04-30 | HocrEN(Technique 2) - llama | 0.2488 | 0.0747 | 0.2209 | 0.2086 | 0.3077 | 0.2775 | 0.1442 | 0.0516 | 0.2414 | 0.2046
2020-04-26 | docVQAQV_V0 | 0.2342 | 0.1646 | 0.3133 | 0.2623 | 0.2483 | 0.0549 | 0.2277 | 0.1856 | 0.1034 | 0.1635
2025-04-30 | HocrEN(Technique 2) - mistral | 0.1830 | 0.0340 | 0.2019 | 0.1499 | 0.2319 | 0.1716 | 0.0611 | 0.0219 | 0.0000 | 0.1638
2025-05-15 | gmini25 | 0.1714 | 0.1298 | 0.1693 | 0.1754 | 0.1808 | 0.1604 | 0.1736 | 0.2353 | 0.2069 | 0.1458
2025-05-15 | doubao15 | 0.1585 | 0.1141 | 0.1619 | 0.1599 | 0.1676 | 0.1460 | 0.1741 | 0.2194 | 0.2414 | 0.1430
2025-05-15 | claude37 | 0.1584 | 0.1153 | 0.1600 | 0.1594 | 0.1691 | 0.1480 | 0.1578 | 0.1868 | 0.2414 | 0.1458
2025-05-15 | gpt4o | 0.1541 | 0.1002 | 0.1563 | 0.1515 | 0.1685 | 0.1447 | 0.1811 | 0.1949 | 0.1724 | 0.1353
2025-05-15 | wenxin45 | 0.1477 | 0.1048 | 0.1494 | 0.1454 | 0.1655 | 0.1280 | 0.1663 | 0.1966 | 0.1034 | 0.1093
2021-02-08 | seq2seq | 0.1081 | 0.0758 | 0.1283 | 0.0829 | 0.1332 | 0.0822 | 0.0786 | 0.0779 | 0.4828 | 0.1052
2024-01-23 | lixiang-vlm-7b-handled | 0.0990 | 0.0478 | 0.0798 | 0.0348 | 0.1648 | 0.0863 | 0.1309 | 0.1395 | 0.5517 | 0.1191
2024-01-24 | lixiang-vlm-7b | 0.0631 | 0.0313 | 0.0693 | 0.0272 | 0.0894 | 0.0639 | 0.0122 | 0.1145 | 0.5517 | 0.0826
2024-01-21 | lixiang-vlm handled | 0.0536 | 0.0243 | 0.0272 | 0.0097 | 0.1084 | 0.0400 | 0.0605 | 0.0395 | 0.1034 | 0.0568
2024-01-21 | lixiang-vlm | 0.0264 | 0.0176 | 0.0123 | 0.0045 | 0.0502 | 0.0262 | 0.0078 | 0.0291 | 0.1034 | 0.0273
2020-06-16 | Test Submission | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
2024-09-11 | zs | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
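For offline analysis, the table above can be loaded straight from its markdown source. Below is a minimal sketch, assuming the page is saved locally as `leaderboard.md` (a placeholder file name); it parses the pipe-delimited rows and reports, per question category, how far the best model still is from human performance.

```python
# A minimal analysis sketch over the leaderboard table above.
# Assumes the table is saved locally as "leaderboard.md" (placeholder).
import pandas as pd

cols = ["Date", "Method", "Score", "Figure/Diagram", "Form",
        "Table/List", "Layout", "Free text", "Image/Photo",
        "Handwritten", "Yes/No", "Others"]

rows = []
with open("leaderboard.md", encoding="utf-8") as f:
    for line in f:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Data rows start with an ISO date such as 2020-06-13.
        if len(cells) == len(cols) and cells[0][:2] == "20":
            rows.append(cells)

df = pd.DataFrame(rows, columns=cols)
df[cols[2:]] = df[cols[2:]].astype(float)

human = df[df["Method"] == "Human Performance"][cols[2:]].iloc[0]
best_model = df[df["Method"] != "Human Performance"][cols[2:]].max()
gap = (human - best_model).round(4)
print(gap.sort_values())  # negative entries: a model already beats humans
```

On the numbers above, the gap is negative only for the Yes/No category, where Seed-VL-1.5 reports a perfect 1.0000 against the human 0.9974; everywhere else humans still lead.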