R.R.C. Robust Reading Competition
DocVQA 2020-23 / Results / Task 3 - Infographics VQA / Method: InternVL2.5-78B-MPO (generalist)

Method: InternVL2.5-78B-MPO (generalist) (2024-12-24)

Authors: InternVL team

Affiliation: Shanghai AI Laboratory & Tsinghua University

Email: wangweiyun@pjlab.org.cn

Description: InternVL2.5-MPO: Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

@article{wang2024mpo,
  title={Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization},
  author={Wang, Weiyun and Chen, Zhe and Wang, Wenhai and Cao, Yue and Liu, Yangzhou and Gao, Zhangwei and Zhu, Jinguo and Zhu, Xizhou and Lu, Lewei and Qiao, Yu and Dai, Jifeng},
  journal={arXiv preprint arXiv:2411.10442},
  year={2024}
}

@article{chen2024expanding,
  title={Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling},
  author={Chen, Zhe and Wang, Weiyun and Cao, Yue and Liu, Yangzhou and Gao, Zhangwei and Cui, Erfei and Zhu, Jinguo and Ye, Shenglong and Tian, Hao and Liu, Zhaoyang and others},
  journal={arXiv preprint arXiv:2412.05271},
  year={2024}
}

Source code