Method: qwen2-vl

Date: 2024-07-12

Authors: Qwen team

Affiliation: Alibaba Group

Description: qwen2-vl

Ranking Table

| Date | Method | Score | Answer type: Image span | Answer type: Question span | Answer type: Multiple spans | Answer type: Non span | Evidence: Table/List | Evidence: Textual | Evidence: Visual object | Evidence: Figure | Evidence: Map | Operation: Comparison | Operation: Arithmetic | Operation: Counting |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2022-03-02 | Human Performance | 0.9718 | 0.9745 | 0.9777 | 0.9335 | 0.9716 | 0.9780 | 0.9789 | 0.9770 | 0.9699 | 0.9433 | 0.9712 | 0.9837 | 0.9544 |
| 2024-07-12 | qwen2-vl | 0.8469 | 0.8739 | 0.8708 | 0.7778 | 0.7424 | 0.8596 | 0.9430 | 0.7827 | 0.8170 | 0.7592 | 0.7295 | 0.8977 | 0.5793 |
| 2024-06-30 | InternVL2-Pro (generalist) | 0.8334 | 0.8681 | 0.8929 | 0.7350 | 0.6969 | 0.8335 | 0.9260 | 0.7757 | 0.8093 | 0.7186 | 0.7301 | 0.8584 | 0.5368 |
| 2024-09-25 | Molmo-72B | 0.8186 | 0.8513 | 0.8827 | 0.6821 | 0.7041 | 0.8184 | 0.9136 | 0.8062 | 0.7945 | 0.6960 | 0.7054 | 0.8188 | 0.5930 |
| 2024-11-20 | Eagle-2-9B | 0.7805 | 0.8158 | 0.8212 | 0.6386 | 0.6572 | 0.7736 | 0.9019 | 0.7395 | 0.7450 | 0.6317 | 0.6203 | 0.8062 | 0.5055 |
| 2024-04-27 | InternVL-1.5-Plus (generalist) | 0.7574 | 0.7989 | 0.8124 | 0.6425 | 0.5987 | 0.7544 | 0.8733 | 0.7306 | 0.7234 | 0.6216 | 0.6065 | 0.7386 | 0.4623 |
| 2024-01-24 | qwenvl-max (single generalist model) | 0.7341 | 0.7756 | 0.8083 | 0.6035 | 0.5717 | 0.7291 | 0.8856 | 0.6708 | 0.6892 | 0.5967 | 0.6009 | 0.7152 | 0.4388 |
| 2024-05-31 | GPT-4 Vision Turbo + Amazon Textract OCR | 0.7191 | 0.7575 | 0.7795 | 0.6591 | 0.5553 | 0.7183 | 0.8201 | 0.6696 | 0.6904 | 0.6926 | 0.5815 | 0.6759 | 0.4281 |
| 2023-07-05 | RALLM | 0.7175 | 0.7421 | 0.7884 | 0.0830 | 0.8031 | 0.6866 | 0.7088 | 0.7376 | 0.7214 | 0.8049 | 0.7141 | 0.8038 | 0.7916 |
| 2024-11-01 | MLCD-Embodied-7B: Multi-label Cluster Discrimination for Visual Representation Learning | 0.6998 | 0.7330 | 0.7930 | 0.5955 | 0.5564 | 0.6951 | 0.8271 | 0.6654 | 0.6614 | 0.5495 | 0.5523 | 0.6350 | 0.4905 |
| 2024-04-02 | InternLM-XComposer2-4KHD-7B | 0.6855 | 0.7336 | 0.7570 | 0.5151 | 0.5124 | 0.6643 | 0.8240 | 0.6598 | 0.6471 | 0.5241 | 0.5120 | 0.6636 | 0.3610 |
| 2023-11-15 | SMoLA-PaLI-X Specialist Model | 0.6621 | 0.7166 | 0.7252 | 0.5838 | 0.4292 | 0.6448 | 0.8261 | 0.6714 | 0.6110 | 0.5065 | 0.5238 | 0.5054 | 0.3506 |
| 2024-02-10 | ScreenAI 5B | 0.6590 | 0.7162 | 0.7247 | 0.5734 | 0.4140 | 0.6525 | 0.8315 | 0.5968 | 0.6020 | 0.4467 | 0.4815 | 0.5303 | 0.3000 |
| 2023-12-07 | SMoLA-PaLI-X Generalist Model | 0.6556 | 0.7107 | 0.7228 | 0.5642 | 0.4197 | 0.6200 | 0.8237 | 0.6710 | 0.6095 | 0.5246 | 0.5159 | 0.4988 | 0.3372 |
| 2024-09-08 | neetolab-sota-v1 | 0.6195 | 0.6620 | 0.7021 | 0.4814 | 0.4513 | 0.6015 | 0.7652 | 0.5505 | 0.5776 | 0.4996 | 0.4676 | 0.5528 | 0.3491 |
| 2021-04-11 | Applica.ai TILT | 0.6120 | 0.6765 | 0.6419 | 0.4391 | 0.3832 | 0.5917 | 0.7916 | 0.4545 | 0.5654 | 0.4480 | 0.4801 | 0.4958 | 0.2652 |
| 2024-07-22 | Snowflake Arctic-TILT 0.8B | 0.5695 | 0.6274 | 0.6074 | 0.4123 | 0.3653 | 0.5478 | 0.7530 | 0.4204 | 0.5109 | 0.4410 | 0.4350 | 0.5042 | 0.2238 |
| 2023-08-20 | PaLI-X (Google Research, Single Generative Model) | 0.5477 | 0.5940 | 0.6950 | 0.4122 | 0.3534 | 0.5145 | 0.6891 | 0.6373 | 0.5040 | 0.4013 | 0.4290 | 0.4053 | 0.3091 |
| 2024-09-03 | tiancaili | 0.4992 | 0.5293 | 0.7171 | 0.3317 | 0.3549 | 0.4633 | 0.6105 | 0.5544 | 0.4680 | 0.4139 | 0.4013 | 0.4113 | 0.2959 |
| 2024-05-21 | PaliGemma-3B (finetune, 896px) | 0.4775 | 0.5214 | 0.5372 | 0.3301 | 0.3220 | 0.4500 | 0.6057 | 0.4252 | 0.4377 | 0.3690 | 0.3742 | 0.3924 | 0.2507 |
| 2024-07-26 | loixc-vqa | 0.4715 | 0.5000 | 0.6815 | 0.3250 | 0.3309 | 0.4521 | 0.5853 | 0.4108 | 0.4364 | 0.3612 | 0.4006 | 0.3919 | 0.2505 |
| 2024-10-09 | llama3-qwenvit | 0.4329 | 0.5077 | 0.5162 | 0.2329 | 0.1650 | 0.4207 | 0.5568 | 0.4785 | 0.4053 | 0.3014 | 0.3371 | 0.1311 | 0.2118 |
| 2023-10-09 | nnrc_udop_224 | 0.4299 | 0.4716 | 0.5279 | 0.2410 | 0.2785 | 0.3740 | 0.5755 | 0.3475 | 0.3944 | 0.3347 | 0.2997 | 0.3583 | 0.1866 |
| 2024-05-21 | PaliGemma-3B (finetune, 448px) | 0.4047 | 0.4275 | 0.5801 | 0.2560 | 0.3007 | 0.4010 | 0.4853 | 0.3898 | 0.3742 | 0.3178 | 0.3530 | 0.3360 | 0.2517 |
| 2022-09-18 | pix2struct-large | 0.4001 | 0.4308 | 0.4839 | 0.2059 | 0.3173 | 0.3833 | 0.5256 | 0.2572 | 0.3726 | 0.3283 | 0.2762 | 0.4198 | 0.2017 |
| 2024-07-31 | tixc-vqa | 0.3975 | 0.4264 | 0.6092 | 0.2620 | 0.2496 | 0.3693 | 0.4798 | 0.3826 | 0.3704 | 0.3172 | 0.3571 | 0.2927 | 0.1965 |
| 2021-04-09 | IG-BERT (single model) | 0.3854 | 0.4181 | 0.4481 | 0.2197 | 0.2849 | 0.3373 | 0.5016 | 0.3013 | 0.3706 | 0.3347 | 0.2939 | 0.3564 | 0.2000 |
| 2022-09-18 | pix2struct-base | 0.3820 | 0.4145 | 0.4381 | 0.1655 | 0.3014 | 0.3351 | 0.4971 | 0.2380 | 0.3632 | 0.3257 | 0.2344 | 0.4036 | 0.1888 |
| 2024-10-09 | llama3-internvit | 0.3749 | 0.4294 | 0.5715 | 0.1641 | 0.1627 | 0.3721 | 0.4580 | 0.4741 | 0.3385 | 0.2350 | 0.3329 | 0.1114 | 0.2109 |
| 2024-04-23 | dolma_multifinetuning | 0.3633 | 0.3832 | 0.5660 | 0.2045 | 0.2657 | 0.3284 | 0.4570 | 0.4042 | 0.3329 | 0.2174 | 0.3117 | 0.2731 | 0.2491 |
| 2021-04-11 | NAVER CLOVA | 0.3219 | 0.3996 | 0.2317 | 0.1064 | 0.1068 | 0.2653 | 0.4488 | 0.1878 | 0.3095 | 0.3231 | 0.2020 | 0.1480 | 0.0695 |
| 2021-04-10 | Ensemble LM and VLM | 0.2853 | 0.3337 | 0.4181 | 0.0748 | 0.1169 | 0.2439 | 0.3649 | 0.2331 | 0.2645 | 0.2845 | 0.2580 | 0.1628 | 0.0647 |
| 2024-05-21 | PaliGemma-3B (finetune, 224px) | 0.2846 | 0.2888 | 0.5024 | 0.1567 | 0.2425 | 0.2675 | 0.3206 | 0.3164 | 0.2609 | 0.2406 | 0.2979 | 0.2025 | 0.2730 |
| 2021-11-09 | LayoutLMv2 LARGE | 0.2829 | 0.3430 | 0.2763 | 0.0641 | 0.1114 | 0.2449 | 0.3855 | 0.1440 | 0.2601 | 0.3110 | 0.1897 | 0.1130 | 0.1158 |
| 2022-09-20 | BROS_BASE (WebViCoB 1M) | 0.2809 | 0.3436 | 0.2485 | 0.0277 | 0.1303 | 0.2545 | 0.3620 | 0.1318 | 0.2767 | 0.2886 | 0.2207 | 0.1745 | 0.0854 |
| 2022-03-03 | InfographicVQA paper model | 0.2720 | 0.3278 | 0.2386 | 0.0450 | 0.1371 | 0.2400 | 0.3626 | 0.1705 | 0.2551 | 0.2205 | 0.1836 | 0.1559 | 0.1140 |
| 2021-04-05 | BERT fuzzy search | 0.2078 | 0.2625 | 0.2333 | 0.0739 | 0.0259 | 0.1852 | 0.2995 | 0.0896 | 0.1942 | 0.1709 | 0.1805 | 0.0160 | 0.0436 |
| 2021-04-10 | BERT | 0.1678 | 0.2149 | 0.2117 | 0.0126 | 0.0152 | 0.1479 | 0.2450 | 0.1054 | 0.1505 | 0.1768 | 0.1578 | 0.0158 | 0.0185 |
| 2024-07-13 | 0710 | 0.1407 | 0.1449 | 0.2181 | 0.0674 | 0.1252 | 0.1294 | 0.1612 | 0.1334 | 0.1368 | 0.1041 | 0.1261 | 0.1397 | 0.1072 |
