DocVQA 2020-23
Task 3 - Infographics VQA

Method: InternVL2-Pro (generalist) (submitted 2024-06-30)

Authors: InternVL team

Affiliation: Shanghai AI Laboratory & SenseTime & Tsinghua University

Email: czcz94cz@gmail.com

Description: InternVL Family: Closing the Gap to Commercial Multimodal Models with Open-Source Suites, a pioneering open-source alternative to GPT-4V.

Demo: https://internvl.opengvlab.com/
Code: https://github.com/OpenGVLab/InternVL
Model: https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5
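For context, below is a minimal single-image VQA sketch, assuming the usage pattern documented in the linked repository and model card: the checkpoint is loaded with trust_remote_code=True and queried through its chat() method. The image path, the example question, and the single-tile 448x448 preprocessing are illustrative placeholders; the official example uses dynamic high-resolution tiling, and the pipeline used for the actual DocVQA submission is not described on this page.

# Minimal inference sketch for the released InternVL-Chat-V1-5 checkpoint (assumptions noted above).
import torch
import torchvision.transforms as T
from PIL import Image
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/InternVL-Chat-V1-5"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,  # model code is fetched from the Hugging Face repo
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)

# Simplified preprocessing: a single 448x448 tile with ImageNet normalization.
# The official example instead tiles high-resolution inputs dynamically.
transform = T.Compose([
    T.Lambda(lambda im: im.convert("RGB")),
    T.Resize((448, 448)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
pixel_values = transform(Image.open("infographic.png")).unsqueeze(0).to(torch.bfloat16).cuda()

question = "What percentage of respondents chose option A?"  # placeholder question
generation_config = dict(num_beams=1, max_new_tokens=64, do_sample=False)
response = model.chat(tokenizer, pixel_values, question, generation_config)
print(response)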

@article{chen2023internvl,
  title={Internvl: Scaling up vision foundation models and aligning for generic visual-linguistic tasks},
  author={Chen, Zhe and Wu, Jiannan and Wang, Wenhai and Su, Weijie and Chen, Guo and Xing, Sen and Muyan, Zhong and Zhang, Qinglong and Zhu, Xizhou and Lu, Lewei and others},
  journal={arXiv preprint arXiv:2312.14238},
  year={2023}
}

@article{chen2024far,
  title={How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites},
  author={Chen, Zhe and Wang, Weiyun and Tian, Hao and Ye, Shenglong and Gao, Zhangwei and Cui, Erfei and Tong, Wenwen and Hu, Kongzhi and Luo, Jiapeng and Ma, Zheng and others},
  journal={arXiv preprint arXiv:2404.16821},
  year={2024}
}
