Method: DocGptVQA - Task 1 - DUDE - Document UnderstanDing of Everything 😎

method: DocGptVQA 2023-04-20

Authors: Ren Zhou, Qiaoling Deng, Xinfeng Chang, Luyan Wang, Xiaochen Hu, Hui Li, Yaqiang Wu

Affiliation: Lenovo Research

Description: We integrated the prediction outputs of the UDOP model and BLIP-2 to improve our results. We also optimized the image encoder and added page-number features to address the challenge of multi-page documents. GPT is used to generate Python-like modular programs.
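
The description does not say how the outputs of the two models are integrated. As an illustration only, the sketch below shows one simple way to merge two answer candidates: keep the answer when both models agree, otherwise take the one with the higher confidence. The `Prediction` class, the `combine_predictions` function, and the confidence scores are hypothetical names and assumptions, not part of the submitted method.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One model's answer to a question (hypothetical structure)."""
    answer: str
    confidence: float  # assumed to be a score in [0, 1]


def combine_predictions(udop: Prediction, blip2: Prediction) -> str:
    """Merge two candidate answers into one (illustrative rule only).

    Assumption: both models expose a comparable confidence score.
    """
    # If the answers match after light normalization, keep that answer.
    if udop.answer.strip().lower() == blip2.answer.strip().lower():
        return udop.answer.strip()
    # Otherwise prefer the more confident model; ties go to the first one.
    best = max((udop, blip2), key=lambda p: p.confidence)
    return best.answer.strip()


if __name__ == "__main__":
    merged = combine_predictions(
        Prediction(answer="12", confidence=0.3),
        Prediction(answer="21", confidence=0.8),
    )
    print(merged)
```

A rule like this is only meaningful if the two confidence scores are calibrated against each other; in practice a weighting or a learned selector might be needed instead.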