Ranking Table

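Column groups: Answer type (Image span, Question span, Multiple spans, Non span); Evidence (Table/List, Textual, Visual object, Figure, Map); Operation (Comparison, Arithmetic, Counting).

| Date | Method | Score | Image span | Question span | Multiple spans | Non span | Table/List | Textual | Visual object | Figure | Map | Comparison | Arithmetic | Counting |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2024-12-24 | InternVL2.5-78B-MPO (generalist) | 0.8428 | 0.8765 | 0.8753 | 0.6977 | 0.7357 | 0.8313 | 0.9247 | 0.8398 | 0.8229 | 0.7338 | 0.7280 | 0.8812 | 0.5865 |
| 2024-06-30 | InternVL2-Pro (generalist) | 0.8334 | 0.8681 | 0.8929 | 0.7350 | 0.6969 | 0.8335 | 0.9260 | 0.7757 | 0.8093 | 0.7186 | 0.7301 | 0.8584 | 0.5368 |
| 2024-09-25 | Molmo-72B | 0.8186 | 0.8513 | 0.8827 | 0.6821 | 0.7041 | 0.8184 | 0.9136 | 0.8062 | 0.7945 | 0.6960 | 0.7054 | 0.8188 | 0.5930 |
| 2025-01-10 | VideoLLaMA3-7B | 0.7893 | 0.8269 | 0.8358 | 0.6845 | 0.6447 | 0.7936 | 0.9165 | 0.7446 | 0.7499 | 0.6661 | 0.6411 | 0.7785 | 0.5179 |
| 2024-12-13 | DeepSeek-VL2 | 0.7814 | 0.8189 | 0.8010 | 0.6989 | 0.6363 | 0.7935 | 0.9041 | 0.7371 | 0.7434 | 0.6327 | 0.6206 | 0.7282 | 0.5326 |
| 2024-04-27 | InternVL-1.5-Plus (generalist) | 0.7574 | 0.7989 | 0.8124 | 0.6425 | 0.5987 | 0.7544 | 0.8733 | 0.7306 | 0.7234 | 0.6216 | 0.6065 | 0.7386 | 0.4623 |
| 2024-11-01 | MLCD-Embodied-7B: Multi-label Cluster Discrimination for Visual Representation Learning | 0.6998 | 0.7330 | 0.7930 | 0.5955 | 0.5564 | 0.6951 | 0.8271 | 0.6654 | 0.6614 | 0.5495 | 0.5523 | 0.6350 | 0.4905 |
| 2024-04-02 | InternLM-XComposer2-4KHD-7B | 0.6855 | 0.7336 | 0.7570 | 0.5151 | 0.5124 | 0.6643 | 0.8240 | 0.6598 | 0.6471 | 0.5241 | 0.5120 | 0.6636 | 0.3610 |
| 2024-05-21 | PaliGemma-3B (finetune, 896px) | 0.4775 | 0.5214 | 0.5372 | 0.3301 | 0.3220 | 0.4500 | 0.6057 | 0.4252 | 0.4377 | 0.3690 | 0.3742 | 0.3924 | 0.2507 |
| 2024-05-21 | PaliGemma-3B (finetune, 448px) | 0.4047 | 0.4275 | 0.5801 | 0.2560 | 0.3007 | 0.4010 | 0.4853 | 0.3898 | 0.3742 | 0.3178 | 0.3530 | 0.3360 | 0.2517 |
| 2024-05-21 | PaliGemma-3B (finetune, 224px) | 0.2846 | 0.2888 | 0.5024 | 0.1567 | 0.2425 | 0.2675 | 0.3206 | 0.3164 | 0.2609 | 0.2406 | 0.2979 | 0.2025 | 0.2730 |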

Ranking Graphic