Method: DocBlipVQA

Date: 2023-04-16

Authors: Ren Zhou, Qiaoling Deng, Xinfeng Chang, Luyan Wang, Xiaochen Hu, Hui Li, Yaqiang Wu

Affiliation: Lenovo Research

Description: We combined the prediction outputs of the UDOP model and BLIP-2 to improve our results. We also optimized the image encoder and added page-number features to handle multi-page documents.
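
The description does not say how the two models' outputs were merged. As a purely illustrative sketch, one simple possibility is to keep whichever answer has the higher confidence score. The `Prediction` class, its fields, and the `combine` function are hypothetical names introduced here, and the confidence-based rule is an assumption, not the authors' documented method.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One model's answer to a question (hypothetical structure)."""
    answer: str
    confidence: float  # assumed to be a score in [0, 1]
    page: int          # assumed 0-based index of the page the answer came from


def combine(udop: Prediction, blip2: Prediction) -> Prediction:
    """Return the higher-confidence prediction; ties go to the first argument."""
    return udop if udop.confidence >= blip2.confidence else blip2


# Example with made-up values: the second prediction is more confident.
first = Prediction(answer="42", confidence=0.61, page=0)
second = Prediction(answer="forty-two", confidence=0.87, page=2)
chosen = combine(first, second)
```

Other merging rules, such as voting over several candidates or weighting each model by its validation accuracy, would fit the same interface.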