- Task 1 - Single Page Document VQA
- Task 2 - Document Collection VQA
- Task 3 - Infographics VQA
- Task 4 - MP-DocVQA
method: Human Performance (2020-06-13)
Authors: DocVQA Organizers
Affiliation: CVIT, IIIT Hyderabad, CVC-UAB, Amazon
Description: Human performance on the test set.
A small group of volunteers was asked to enter an answer for each question given its document image.
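Submissions (and the human answers above) are scored with ANLS (Average Normalized Levenshtein Similarity), the challenge's standard metric; every numeric column in the table below reports it, overall and per question category. For reference, here is a minimal self-contained sketch of how one prediction is scored against a ground-truth answer set, assuming the 0.5 similarity threshold from the DocVQA evaluation protocol:

```python
# Minimal ANLS sketch: best normalized edit similarity against any
# ground-truth answer, zeroed below the 0.5 threshold.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    if not a:
        return len(b)
    if not b:
        return len(a)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute
            ))
        prev = cur
    return prev[-1]

def anls(prediction: str, answers: list[str], threshold: float = 0.5) -> float:
    """Score one prediction against all accepted ground-truth answers."""
    best = 0.0
    for gt in answers:
        p, g = prediction.strip().lower(), gt.strip().lower()
        sim = 1.0 - levenshtein(p, g) / max(len(p), len(g), 1)
        best = max(best, sim)
    return best if best >= threshold else 0.0

print(anls("18 Dec, 1987", ["18 Dec 1987", "December 18, 1987"]))  # ~0.92
```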
method: qwen2-vl (2024-07-11)
Authors: Qwen Team
Affiliation: Alibaba Group
Description: qwen2-vl
method: InternVL2-Pro (generalist) (2024-06-30)
Authors: InternVL team
Affiliation: Shanghai AI Laboratory & Sensetime & Tsinghua University
Email: czcz94cz@gmail.com
Description: InternVL Family: Closing the Gap to Commercial Multimodal Models with Open-Source Suites - A Pioneering Open-Source Alternative to GPT-4V
Demo: https://internvl.opengvlab.com/
Code: https://github.com/OpenGVLab/InternVL
Model: https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5
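For readers who want to reproduce these numbers, the checkpoint linked above loads through the standard Hugging Face transformers API. The sketch below covers model loading only; image preprocessing and the chat-style inference helper are supplied by the repo's custom code, so see the model card for the full question-answering example.

```python
# Minimal loading sketch for the linked checkpoint. The repo ships custom
# modeling code, so trust_remote_code=True is required.
import torch
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/InternVL-Chat-V1-5"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,  # half precision to reduce memory
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).eval()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
```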
Date | Method | Score | Figure/Diagram | Form | Table/List | Layout | Free Text | Image/Photo | Handwritten | Yes/No | Others
---|---|---|---|---|---|---|---|---|---|---|---
2020-06-13 | Human Performance | 0.9811 | 0.9756 | 0.9825 | 0.9780 | 0.9845 | 0.9839 | 0.9740 | 0.9717 | 0.9974 | 0.9828
2024-07-11 | qwen2-vl | 0.9670 | 0.9206 | 0.9816 | 0.9703 | 0.9678 | 0.9619 | 0.9135 | 0.9436 | 0.9655 | 0.9540
2024-06-30 | InternVL2-Pro (generalist) | 0.9506 | 0.8888 | 0.9714 | 0.9486 | 0.9582 | 0.9446 | 0.8909 | 0.9278 | 0.9655 | 0.9410
2024-09-25 | Molmo-72B | 0.9351 | 0.8822 | 0.9548 | 0.9387 | 0.9413 | 0.9100 | 0.8688 | 0.9196 | 0.9195 | 0.9229
2024-01-24 | qwenvl-max (single generalist model) | 0.9307 | 0.8491 | 0.9474 | 0.9195 | 0.9403 | 0.9380 | 0.8652 | 0.8922 | 0.8621 | 0.9341
2024-05-10 | Vary (using multi crop) | 0.9241 | 0.8926 | 0.9372 | 0.8953 | 0.9405 | 0.9447 | 0.9035 | 0.9335 | 0.8739 | 0.9478
2024-04-27 | InternVL-1.5-Plus (generalist) | 0.9234 | 0.8354 | 0.9556 | 0.9123 | 0.9397 | 0.9032 | 0.8313 | 0.9064 | 0.9655 | 0.9098
2024-11-01 | MLCD-Embodied-7B: Multi-label Cluster Discrimination for Visual Representation Learning | 0.9158 | 0.8286 | 0.9315 | 0.9131 | 0.9289 | 0.9088 | 0.7804 | 0.8300 | 0.8897 | 0.8796
2023-12-07 | qwenvl-plus (single generalist model) | 0.9141 | 0.8146 | 0.9464 | 0.8999 | 0.9277 | 0.9265 | 0.8419 | 0.8776 | 0.9310 | 0.8667
2023-11-15 | SMoLA-PaLI-X Specialist Model | 0.9084 | 0.7790 | 0.9416 | 0.8934 | 0.9262 | 0.9188 | 0.7911 | 0.8508 | 0.8966 | 0.8456
2023-12-07 | SMoLA-PaLI-X Generalist Model | 0.9055 | 0.7757 | 0.9381 | 0.8924 | 0.9187 | 0.9179 | 0.8364 | 0.8483 | 0.7446 | 0.8609
2024-05-01 | Snowflake Arctic-TILT 0.8B (fine-tuned) | 0.9020 | 0.7198 | 0.9398 | 0.9152 | 0.9015 | 0.9042 | 0.6860 | 0.8415 | 0.6897 | 0.8604
2022-10-08 | BAIDU-DI | 0.9016 | 0.6823 | 0.9186 | 0.9139 | 0.9138 | 0.9234 | 0.6841 | 0.7949 | 0.6181 | 0.8344
2024-04-02 | InternLM-XComposer2-4KHD-7B | 0.9002 | 0.8041 | 0.9400 | 0.8965 | 0.9143 | 0.8618 | 0.7845 | 0.8264 | 0.8621 | 0.8298
2024-02-10 | ScreenAI 5B | 0.8988 | 0.7297 | 0.9419 | 0.8928 | 0.9158 | 0.8873 | 0.7722 | 0.8160 | 0.8966 | 0.8551
2024-05-01 | Snowflake Arctic-TILT 0.8B (zero-shot) | 0.8881 | 0.6826 | 0.9311 | 0.9011 | 0.8867 | 0.8917 | 0.6534 | 0.8219 | 0.6897 | 0.8515
2022-03-31 | Tencent Youtu | 0.8866 | 0.7576 | 0.9470 | 0.8932 | 0.8821 | 0.8654 | 0.6680 | 0.8877 | 0.4828 | 0.8413
2022-01-13 | ERNIE-Layout 2.0 | 0.8841 | 0.6434 | 0.9177 | 0.8996 | 0.8899 | 0.9010 | 0.6223 | 0.7836 | 0.6124 | 0.8118
2023-12-10 | DocFormerv2 (Single Model with 750M Parameters) | 0.8784 | 0.6680 | 0.9382 | 0.9076 | 0.8676 | 0.8555 | 0.5840 | 0.8123 | 0.8276 | 0.8070
2024-10-30 | BlueLM-V-3B | 0.8775 | 0.7652 | 0.9245 | 0.8659 | 0.9005 | 0.8372 | 0.8079 | 0.8276 | 0.7931 | 0.7734
2024-09-08 | neetolab-sota-v1 | 0.8759 | 0.7938 | 0.9209 | 0.8577 | 0.8946 | 0.8558 | 0.8011 | 0.8664 | 0.6207 | 0.8261
2021-11-26 | Mybank-DocReader | 0.8755 | 0.6682 | 0.9233 | 0.8763 | 0.8896 | 0.8713 | 0.6290 | 0.8047 | 0.5805 | 0.7804
2021-09-06 | ERNIE-Layout 1.0 | 0.8753 | 0.6586 | 0.8972 | 0.8864 | 0.8902 | 0.8943 | 0.6392 | 0.7331 | 0.5434 | 0.8115
2024-08-22 | Mini-Monkey | 0.8738 | 0.7334 | 0.9350 | 0.8493 | 0.9046 | 0.8383 | 0.7931 | 0.8262 | 0.6782 | 0.7628
2024-05-31 | GPT-4 Vision Turbo + Amazon Textract OCR | 0.8736 | 0.7346 | 0.9196 | 0.8756 | 0.8678 | 0.8709 | 0.8137 | 0.8681 | 0.8966 | 0.8464
2021-02-12 | Applica.ai TILT | 0.8705 | 0.6082 | 0.9459 | 0.8980 | 0.8592 | 0.8581 | 0.5508 | 0.8139 | 0.6897 | 0.7788
2023-05-31 | PaLI-X (Google Research; Single Generative Model) | 0.8679 | 0.6971 | 0.8992 | 0.8400 | 0.8955 | 0.8925 | 0.7589 | 0.7209 | 0.8966 | 0.8468
2020-12-22 | LayoutLM 2.0 (single model) | 0.8672 | 0.6574 | 0.8953 | 0.8769 | 0.8791 | 0.8707 | 0.7287 | 0.6729 | 0.5517 | 0.8103
2024-01-24 | nnrc_vary | 0.8631 | 0.6689 | 0.9174 | 0.8354 | 0.8876 | 0.8761 | 0.6891 | 0.8269 | 0.6207 | 0.7696
2023-12-10 | 54_nnrc_zephyr | 0.8560 | 0.6170 | 0.8924 | 0.8603 | 0.8546 | 0.9020 | 0.6083 | 0.8142 | 0.7488 | 0.8386
2020-08-16 | Alibaba DAMO NLP | 0.8506 | 0.6650 | 0.8809 | 0.8552 | 0.8733 | 0.8397 | 0.6758 | 0.7691 | 0.5492 | 0.7526
2020-05-16 | PingAn-OneConnect-Gammalab-DQA | 0.8484 | 0.6059 | 0.9021 | 0.8463 | 0.8730 | 0.8337 | 0.5812 | 0.7692 | 0.5172 | 0.7289
2024-05-01 | PaliGemma-3B (finetune, 896px) | 0.8477 | 0.6543 | 0.9252 | 0.8326 | 0.8733 | 0.8099 | 0.7382 | 0.8314 | 0.7931 | 0.7571
2024-01-21 | Spatial LLM v1.2 | 0.8443 | 0.6300 | 0.8917 | 0.8180 | 0.8644 | 0.8877 | 0.6106 | 0.7390 | 0.6897 | 0.8097
2023-02-21 | LayoutLMv2_star_seg_large | 0.8430 | 0.7008 | 0.8737 | 0.8389 | 0.8536 | 0.8498 | 0.6872 | 0.7823 | 0.6181 | 0.8252
2024-06-26 | MoVA-8B (generalist) | 0.8341 | 0.7639 | 0.8494 | 0.8131 | 0.8752 | 0.8187 | 0.6503 | 0.7048 | 0.5172 | 0.7901
2023-06-30 | LATIN-Prompt + Claude (Zero shot) | 0.8336 | 0.6601 | 0.8553 | 0.8584 | 0.8169 | 0.8726 | 0.6021 | 0.6774 | 0.7126 | 0.8258
2024-10-09 | llama3-qwenvit | 0.8318 | 0.7377 | 0.8928 | 0.7806 | 0.8822 | 0.8009 | 0.7732 | 0.7934 | 0.5862 | 0.7531
2024-09-13 | gemma+ocr | 0.8282 | 0.6326 | 0.8395 | 0.8298 | 0.8346 | 0.8778 | 0.6417 | 0.6176 | 0.6410 | 0.8113
2023-12-01 | nnrc mplugowl2_9k | 0.8281 | 0.5780 | 0.8949 | 0.7860 | 0.8662 | 0.8631 | 0.6302 | 0.8054 | 0.5517 | 0.7867
2023-11-27 | 36_nnrc_llama2 | 0.8239 | 0.5404 | 0.8787 | 0.7958 | 0.8475 | 0.8813 | 0.5995 | 0.7991 | 0.6897 | 0.7922
2024-01-11 | nnrc_udop_224_6ds | 0.8227 | 0.5909 | 0.8706 | 0.8352 | 0.8335 | 0.8086 | 0.5972 | 0.6835 | 0.5862 | 0.7472
2024-08-02 | loixc-onestage | 0.8221 | 0.6215 | 0.8463 | 0.8020 | 0.8593 | 0.8407 | 0.5835 | 0.6453 | 0.7241 | 0.7352
2024-07-26 | loixc-vqa | 0.8127 | 0.6182 | 0.8188 | 0.7878 | 0.8560 | 0.8496 | 0.5840 | 0.5984 | 0.3993 | 0.7445
2023-05-06 | Docugami-Layout | 0.8031 | 0.5176 | 0.8875 | 0.7902 | 0.8214 | 0.8026 | 0.5089 | 0.7753 | 0.4224 | 0.7022
2024-03-01 | Vary | 0.7916 | 0.7415 | 0.7949 | 0.7378 | 0.8475 | 0.8101 | 0.6671 | 0.6552 | 0.7471 | 0.7888
2022-01-07 | LayoutLMV2-large on Textract | 0.7873 | 0.4924 | 0.8771 | 0.8218 | 0.7726 | 0.7661 | 0.4820 | 0.7276 | 0.3793 | 0.6983
2023-01-29 | LayoutLMv2_star_seg | 0.7859 | 0.5328 | 0.8406 | 0.7859 | 0.8128 | 0.7909 | 0.4879 | 0.6468 | 0.3644 | 0.6953
2024-05-21 | PaliGemma-3B (finetune, 448px) | 0.7802 | 0.6290 | 0.8553 | 0.7235 | 0.8336 | 0.7410 | 0.6787 | 0.7694 | 0.8276 | 0.7123
2023-05-25 | YoBerDaV2 Single-page | 0.7749 | 0.4737 | 0.8894 | 0.7586 | 0.7962 | 0.7398 | 0.4763 | 0.7173 | 0.7586 | 0.6976
2020-05-14 | Structural LM-v2 | 0.7674 | 0.4931 | 0.8381 | 0.7621 | 0.7924 | 0.7596 | 0.4756 | 0.6282 | 0.5517 | 0.6549
2024-10-09 | llama3-intern6b | 0.7670 | 0.6419 | 0.8360 | 0.7036 | 0.8455 | 0.7184 | 0.6323 | 0.7203 | 0.5862 | 0.7121
2022-09-18 | pix2struct-large | 0.7656 | 0.4424 | 0.8827 | 0.7702 | 0.7774 | 0.7085 | 0.5383 | 0.6320 | 0.7586 | 0.6536
2022-12-28 | Submission_ErnieLayout_base_finetuned_on_DocVQA_en_train_dev_textract_word_segments_ck-14000 | 0.7599 | 0.4313 | 0.8678 | 0.7726 | 0.7641 | 0.7330 | 0.4598 | 0.6957 | 0.4828 | 0.6097
2024-09-18 | Gemma 2b + OCR | 0.7517 | 0.4797 | 0.8067 | 0.7147 | 0.7771 | 0.8311 | 0.4922 | 0.5978 | 0.4282 | 0.6948
2024-04-22 | DOLMA_multifinetuning | 0.7458 | 0.4964 | 0.8335 | 0.7234 | 0.7832 | 0.7044 | 0.4135 | 0.5815 | 0.5172 | 0.6593
2024-02-13 | instructblip | 0.7429 | 0.5158 | 0.7918 | 0.7019 | 0.7751 | 0.8088 | 0.5765 | 0.5892 | 0.5172 | 0.7062
2020-05-15 | QA_Base_MRC_2 | 0.7415 | 0.4854 | 0.8015 | 0.6738 | 0.7943 | 0.8136 | 0.5740 | 0.5831 | 0.5287 | 0.7161
2024-07-31 | tixc-vqa | 0.7413 | 0.5732 | 0.7581 | 0.6967 | 0.7965 | 0.7738 | 0.4705 | 0.5396 | 0.5862 | 0.6927
2020-05-15 | QA_Base_MRC_1 | 0.7407 | 0.4890 | 0.7984 | 0.6675 | 0.7936 | 0.8131 | 0.5854 | 0.6099 | 0.4943 | 0.7384
2020-05-15 | QA_Base_MRC_4 | 0.7348 | 0.4735 | 0.8040 | 0.6647 | 0.7838 | 0.8043 | 0.5618 | 0.5810 | 0.4598 | 0.7332
2020-05-15 | QA_Base_MRC_3 | 0.7322 | 0.4852 | 0.7958 | 0.6562 | 0.7842 | 0.8044 | 0.5679 | 0.5730 | 0.4511 | 0.7171
2024-10-26 | 0713ap +gpt4o(no v) | 0.7309 | 0.5116 | 0.8018 | 0.7379 | 0.7305 | 0.7303 | 0.4696 | 0.6240 | 0.4598 | 0.7309
2024-01-22 | VisFocus-Base | 0.7285 | 0.3822 | 0.8695 | 0.7234 | 0.7508 | 0.6717 | 0.3656 | 0.6748 | 0.6897 | 0.5507
2020-05-15 | QA_Base_MRC_5 | 0.7274 | 0.4858 | 0.7877 | 0.6550 | 0.7754 | 0.8047 | 0.5405 | 0.5619 | 0.4598 | 0.7084
2024-05-22 | Dolma multifinetuning 7 | 0.7219 | 0.4532 | 0.8259 | 0.7036 | 0.7585 | 0.6677 | 0.4227 | 0.5740 | 0.5862 | 0.6452
2022-09-18 | pix2struct-base | 0.7213 | 0.4111 | 0.8386 | 0.7253 | 0.7503 | 0.6407 | 0.4211 | 0.5753 | 0.6552 | 0.5822
2024-10-26 | 1010ap +gpt4o(no v) | 0.7201 | 0.4800 | 0.7680 | 0.7335 | 0.7258 | 0.7333 | 0.4797 | 0.5767 | 0.6552 | 0.7111
2024-04-02 | MiniCPM-V-2 | 0.7187 | 0.6012 | 0.8062 | 0.6312 | 0.7880 | 0.6753 | 0.6834 | 0.6789 | 0.7586 | 0.6464
2023-01-27 | LayoutLM-base+GNN | 0.6984 | 0.4747 | 0.7973 | 0.6848 | 0.7322 | 0.6323 | 0.4398 | 0.5599 | 0.5431 | 0.5388
2021-12-05 | Electra Large Squad | 0.6961 | 0.4485 | 0.7703 | 0.6348 | 0.7364 | 0.7644 | 0.4594 | 0.5438 | 0.5172 | 0.6470
2023-05-25 | YoBerDaV1 Multi-page | 0.6904 | 0.3481 | 0.8335 | 0.6411 | 0.7253 | 0.6854 | 0.4191 | 0.6299 | 0.5517 | 0.6129
2020-05-16 | HyperDQA_V4 | 0.6893 | 0.3874 | 0.7792 | 0.6309 | 0.7478 | 0.7187 | 0.4867 | 0.5630 | 0.4138 | 0.5685
2020-05-16 | HyperDQA_V3 | 0.6769 | 0.3876 | 0.7774 | 0.6167 | 0.7332 | 0.6961 | 0.4296 | 0.5373 | 0.4138 | 0.5650
2023-07-06 | GPT3.5 | 0.6759 | 0.4741 | 0.7144 | 0.6524 | 0.7036 | 0.6858 | 0.5385 | 0.5038 | 0.5954 | 0.6660
2020-05-16 | HyperDQA_V2 | 0.6734 | 0.3818 | 0.7666 | 0.6110 | 0.7332 | 0.6867 | 0.4834 | 0.5560 | 0.3793 | 0.5902
2020-05-09 | HyperDQA_V1 | 0.6717 | 0.4013 | 0.7693 | 0.6197 | 0.7167 | 0.6922 | 0.3598 | 0.5596 | 0.4138 | 0.5504
2023-08-15 | LATIN-Tuning-Prompt + Alpaca (Zero-shot) | 0.6687 | 0.3732 | 0.7529 | 0.6545 | 0.6615 | 0.7463 | 0.5439 | 0.4941 | 0.3481 | 0.6831
2023-07-14 | donut_base | 0.6590 | 0.3960 | 0.8407 | 0.6604 | 0.6987 | 0.4630 | 0.2969 | 0.6964 | 0.0345 | 0.5057
2023-12-04 | ViTLP | 0.6588 | 0.3880 | 0.8220 | 0.6705 | 0.6962 | 0.4670 | 0.2973 | 0.6307 | 0.4483 | 0.4910
2023-12-21 | DocVQA: A Dataset for VQA on Document Images | 0.6566 | 0.3569 | 0.7645 | 0.5775 | 0.7000 | 0.7205 | 0.4220 | 0.4802 | 0.4483 | 0.6108
2022-09-22 | BROS_BASE (WebViCoB 6.4M) | 0.6563 | 0.3780 | 0.7757 | 0.6681 | 0.6557 | 0.6175 | 0.3497 | 0.5782 | 0.4224 | 0.5754
2023-09-24 | Layoutlm_DocVQA+Token_v2 | 0.6562 | 0.3935 | 0.7764 | 0.6228 | 0.6737 | 0.6711 | 0.3385 | 0.5109 | 0.5086 | 0.5515
2023-07-21 | donut_half_input_imageSize | 0.6536 | 0.3930 | 0.8366 | 0.6548 | 0.6950 | 0.4609 | 0.2486 | 0.6940 | 0.0345 | 0.4941
2021-12-04 | Bert Large | 0.6447 | 0.3502 | 0.7535 | 0.5488 | 0.6920 | 0.7266 | 0.4171 | 0.5254 | 0.5517 | 0.6076
2022-05-23 | Dessurt | 0.6322 | 0.3164 | 0.8058 | 0.6486 | 0.6520 | 0.4852 | 0.2862 | 0.5830 | 0.3793 | 0.4365
2024-01-09 | dolma | 0.6196 | 0.4003 | 0.7642 | 0.5805 | 0.6609 | 0.5247 | 0.3958 | 0.5596 | 0.5690 | 0.4972
2020-05-09 | bert fulldata fintuned | 0.5900 | 0.4169 | 0.6870 | 0.4269 | 0.6710 | 0.7315 | 0.5124 | 0.4900 | 0.4483 | 0.5907
2020-05-01 | bert finetuned | 0.5872 | 0.2986 | 0.7011 | 0.4849 | 0.6359 | 0.6933 | 0.4622 | 0.4751 | 0.4483 | 0.4895
2020-04-30 | HyperDQA_V0 | 0.5715 | 0.3131 | 0.6780 | 0.4732 | 0.6630 | 0.5716 | 0.3623 | 0.4351 | 0.3793 | 0.4941
2023-09-26 | LayoutLM_Docvqa+Token_v0 | 0.4980 | 0.2319 | 0.6035 | 0.4320 | 0.5684 | 0.4779 | 0.2768 | 0.3081 | 0.1293 | 0.4178
2022-04-27 | LayoutLMv2, Tesseract OCR eval (dataset OCR trained) | 0.4961 | 0.2544 | 0.5523 | 0.4177 | 0.5495 | 0.5914 | 0.2888 | 0.1361 | 0.2069 | 0.4187
2022-03-29 | LayoutLMv2, Tesseract OCR eval (Tesseract OCR trained) | 0.4815 | 0.2253 | 0.5440 | 0.4216 | 0.5207 | 0.5709 | 0.2430 | 0.1353 | 0.3103 | 0.3859
2023-07-26 | donut_large_encoderSize_finetuned_20_epoch | 0.4673 | 0.2236 | 0.6691 | 0.4581 | 0.5026 | 0.2665 | 0.1356 | 0.4983 | 0.5734 | 0.3430
2020-04-27 | bert | 0.4557 | 0.2233 | 0.5259 | 0.2633 | 0.5113 | 0.7775 | 0.4859 | 0.3565 | 0.0345 | 0.5778
2020-05-16 | UGLIFT v0.1 (Clova OCR) | 0.4417 | 0.1766 | 0.5600 | 0.3178 | 0.5340 | 0.4520 | 0.2253 | 0.3573 | 0.4483 | 0.3356
2024-05-21 | PaliGemma-3B (finetune, 224px) | 0.4374 | 0.4025 | 0.4516 | 0.3236 | 0.5574 | 0.4055 | 0.3500 | 0.4077 | 0.6379 | 0.4066
2022-10-21 | Finetuning LayoutLMv3_Base | 0.3596 | 0.2102 | 0.4498 | 0.3858 | 0.3262 | 0.3496 | 0.1552 | 0.3404 | 0.0345 | 0.2706
2023-09-19 | testtest | 0.3569 | 0.3018 | 0.3407 | 0.2748 | 0.4693 | 0.3186 | 0.2682 | 0.2753 | 0.6207 | 0.3356
2020-05-14 | Plain BERT QA | 0.3524 | 0.1687 | 0.4489 | 0.2029 | 0.4321 | 0.4812 | 0.3517 | 0.3096 | 0.0345 | 0.3747
2020-05-16 | Clova OCR V0 | 0.3489 | 0.0977 | 0.4855 | 0.2670 | 0.3811 | 0.3958 | 0.2489 | 0.2875 | 0.0345 | 0.3062
2020-05-01 | HDNet | 0.3401 | 0.2040 | 0.4688 | 0.2181 | 0.4710 | 0.1916 | 0.2488 | 0.2736 | 0.1379 | 0.2458
2020-05-16 | CLOVA OCR | 0.3296 | 0.1246 | 0.4612 | 0.2455 | 0.3622 | 0.3746 | 0.1692 | 0.2736 | 0.0690 | 0.3205
2023-07-21 | donut_small_encoderSize_finetuned_20_epoch | 0.3157 | 0.1935 | 0.4417 | 0.2912 | 0.3400 | 0.2075 | 0.1495 | 0.2658 | 0.3103 | 0.2644
2020-04-29 | docVQAQV_V0.1 | 0.3016 | 0.2010 | 0.3898 | 0.3810 | 0.2933 | 0.0664 | 0.1842 | 0.2736 | 0.1586 | 0.1695
2020-04-26 | docVQAQV_V0 | 0.2342 | 0.1646 | 0.3133 | 0.2623 | 0.2483 | 0.0549 | 0.2277 | 0.1856 | 0.1034 | 0.1635
2021-02-08 | seq2seq | 0.1081 | 0.0758 | 0.1283 | 0.0829 | 0.1332 | 0.0822 | 0.0786 | 0.0779 | 0.4828 | 0.1052
2024-01-23 | lixiang-vlm-7b-handled | 0.0990 | 0.0478 | 0.0798 | 0.0348 | 0.1648 | 0.0863 | 0.1309 | 0.1395 | 0.5517 | 0.1191
2024-01-24 | lixiang-vlm-7b | 0.0631 | 0.0313 | 0.0693 | 0.0272 | 0.0894 | 0.0639 | 0.0122 | 0.1145 | 0.5517 | 0.0826
2024-01-21 | lixiang-vlm handled | 0.0536 | 0.0243 | 0.0272 | 0.0097 | 0.1084 | 0.0400 | 0.0605 | 0.0395 | 0.1034 | 0.0568
2024-01-21 | lixiang-vlm | 0.0264 | 0.0176 | 0.0123 | 0.0045 | 0.0502 | 0.0262 | 0.0078 | 0.0291 | 0.1034 | 0.0273
2020-06-16 | Test Submission | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
2024-09-11 | zs | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
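To slice the leaderboard programmatically, a sketch along these lines works on a local copy of the table above; the file name `docvqa_task1_leaderboard.md` is hypothetical, and the column names must match the header exactly.

```python
# Parse the pipe-delimited leaderboard and report each top entry's
# weakest question category.

COLUMNS = ["Date", "Method", "Score", "Figure/Diagram", "Form", "Table/List",
           "Layout", "Free Text", "Image/Photo", "Handwritten", "Yes/No", "Others"]

rows = []
with open("docvqa_task1_leaderboard.md") as f:  # assumed local export of this table
    for line in f:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != len(COLUMNS):
            continue  # not a table row
        try:
            scores = [float(c) for c in cells[2:]]  # rejects header/separator rows
        except ValueError:
            continue
        rows.append((cells[1], dict(zip(COLUMNS[2:], scores))))

# Rows are already sorted by overall Score, so rows[:5] are the top entries.
for method, per_cat in rows[:5]:
    weakest = min((c for c in per_cat if c != "Score"), key=per_cat.get)
    print(f"{method}: weakest category is {weakest} ({per_cat[weakest]:.4f})")
```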