Task 3 - Infographics VQA - Method: SMoLA-PaLI-X Generalist Model

Method: SMoLA-PaLI-X Generalist Model (2023-12-07)

Authors: SMoLA PaLI Team

Affiliation: Google Research

Description: Omni-SMoLA uses the Soft MoE approach to (softly) mix many multimodal low-rank experts.

@article{wu2023smola,
  title={Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts},
  author={Wu, Jialin and Hu, Xia and Wang, Yaqing and Pang, Bo and Soricut, Radu},
  journal={arXiv preprint arXiv:2312.00968},
  year={2023}
}
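
The description above refers to a soft mixture of low-rank experts layered on a frozen backbone, as in the cited paper. The snippet below is a minimal illustrative sketch of that idea only, not the authors' implementation: the names (smola_layer, W_base, W_router), the single linear projection, and all dimensions are hypothetical, and it stands in for how a softmax router could softly mix LoRA-style low-rank expert updates on top of a frozen base weight.

import numpy as np

rng = np.random.default_rng(0)

d_model, rank, n_experts, n_tokens = 64, 4, 8, 10   # hypothetical sizes

# Frozen base weight and per-expert low-rank factors (A_e: d -> r, B_e: r -> d).
W_base = rng.normal(size=(d_model, d_model)) * 0.02
A = rng.normal(size=(n_experts, d_model, rank)) * 0.02
B = rng.normal(size=(n_experts, rank, d_model)) * 0.02
W_router = rng.normal(size=(d_model, n_experts)) * 0.02

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def smola_layer(x):
    """x: (n_tokens, d_model) -> (n_tokens, d_model)."""
    base_out = x @ W_base                      # frozen backbone path
    gates = softmax(x @ W_router)              # (n_tokens, n_experts), soft routing weights
    # Low-rank expert outputs, shape (n_experts, n_tokens, d_model).
    expert_out = np.einsum('td,edr,erk->etk', x, A, B)
    # Softly mix all experts per token (no hard top-k selection), then add to the base path.
    mixed = np.einsum('te,etk->tk', gates, expert_out)
    return base_out + mixed

x = rng.normal(size=(n_tokens, d_model))
print(smola_layer(x).shape)   # (10, 64)

In this sketch only the expert factors and the router would be trained, while the base weight stays frozen; that separation is what makes the mixture a lightweight add-on rather than a full retraining of the backbone.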