Method: PaLI-X

Date: 2023-05-31

Authors: Xi Chen et al.

Affiliation: Google Research

Description: Scaled up PaLI-X model

Method: PaliGemma-3B (finetune, 896px)

Authors: Google

Description: Predictions generated using paligemma-ft-stvqa-896

Ranking Table

Date        Method                                          Score
2023-12-06  SMoLA-PaLI-X Generalist Model                   0.8603
2023-05-31  PaLI-X                                          0.8446
2024-05-21  PaliGemma-3B (finetune, 896px)                  0.8440
2024-05-21  PaliGemma-3B (finetune, 448px)                  0.8182
2022-09-21  PaLI-17B                                        0.7987
2022-09-21  PaLI-15B                                        0.7652
2022-07-26  GIT2, Single Model                              0.7577
2022-09-20  PaLI-3B                                         0.6972
2022-05-27  GIT, Single Model                               0.6964
2022-11-07  unitnt blip                                     0.6633
2022-09-20  PreSTU CC15M-SplitOCR B+B                       0.6546
2024-05-21  PaliGemma-3B (finetune, 224px)                  0.6329
2022-08-15  LTG                                             0.6089
2022-07-28  TAG                                             0.6019
2020-12-08  TAP                                             0.5967
2022-08-03  ROQ                                             0.5926
2022-11-25  micro_60                                        0.5882
2022-03-15  TWA                                             0.5774
2024-01-12  tap_visualbert                                  0.5691
2020-09-09  ssbaseline                                      0.5500
2022-07-11  danet                                           0.5312
2022-10-22  KgMr                                            0.5273
2021-04-07  m4c_demo                                        0.5256
2022-12-08  a                                               0.5245
2021-07-20  DXM_DI_AI_CV_NLP                                0.5077
2020-08-15  TIG                                             0.5051
2020-05-25  SA-M4C                                          0.5042
2021-02-10  M4C-CVL                                         0.5034
2020-07-28  vm                                              0.4916
2020-04-02  masker                                          0.4828
2020-03-01  SMA                                             0.4659
2019-11-02  M4C (single model)                              0.4621
2019-11-15  RUArt                                           0.3108
2019-04-30  VTA                                             0.2820
2019-04-30  QAQ                                             0.2563
2019-04-22  Clova AI OCR                                    0.2155
2019-11-10  MM-GNN                                          0.2071
2019-04-29  USTB-TQA                                        0.1702
2019-04-29  USTB-TVQA                                       0.0952
2019-04-29  Focus: A bottom-up approach for Scene Text VQA  0.0882

Ranking Graphic