- Task 3 - Infographics VQA - Method: InternLM-XComposer2-4KHD-7B
Method: InternLM-XComposer2-4KHD-7B
Date: 2024-04-02
Authors: InternLM-XComposer Team
Affiliation: Shanghai Artificial Intelligence Laboratory
Description: InternLM-XComposer2-4KHD: Scaling the Resolution of Large Vision-Language Models up to 4KHD and Beyond
1. Generalist model with 4K HD resolution understanding.
2. End-to-end model, no OCR pipeline.
3. No specialist fine-tuning.
See https://github.com/InternLM/InternLM-XComposer for more details.