- Task 1 - Hierarchical Detection on Test Set - Method: Clova DEER
Method: Clova DEER
Date: 2023-04-01
Authors: Song Kayeon, Taeho Kil, Donghyun Kim, Sukmin Seo
Affiliation: Naver Cloud
Description: Our model passes the input image through a CNN and a deformable transformer encoder to extract multi-scale visual features. An independent segmentation head then extracts words, lines, and paragraphs, and a deformable transformer decoder produces the text recognition results. In summary, our single model performs layout detection (task 1) and OCR (task 2) simultaneously.
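
The idea of reading word, line, and paragraph regions off one set of shared features can be illustrated with a toy sketch. This is not the authors' implementation: the feature map below is a hand-written stand-in for the CNN and deformable transformer encoder output, each head is assumed to be a per-pixel linear classifier (one per hierarchy level, which is one reading of the description), and all names (`segmentation_head`, `detect_hierarchy`, `toy_features`, `toy_heads`) are hypothetical.

```python
import math
from typing import Dict, List, Tuple

FeatureMap = List[List[List[float]]]  # H x W x C shared visual features
Mask = List[List[int]]                # H x W binary mask


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def segmentation_head(features: FeatureMap, weights: List[float],
                      bias: float, threshold: float = 0.5) -> Mask:
    """Hypothetical head: per-pixel linear classifier plus threshold."""
    mask = []
    for row in features:
        mask_row = []
        for pixel in row:
            logit = sum(w * f for w, f in zip(weights, pixel)) + bias
            mask_row.append(1 if sigmoid(logit) > threshold else 0)
        mask.append(mask_row)
    return mask


def detect_hierarchy(features: FeatureMap,
                     heads: Dict[str, Tuple[List[float], float]]) -> Dict[str, Mask]:
    """Apply every head to the same shared features, independently."""
    return {level: segmentation_head(features, w, b)
            for level, (w, b) in heads.items()}


# Toy 2 x 3 feature map with 2 channels (assumed values, not model output).
toy_features = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
]

# One (weights, bias) pair per hierarchy level; values chosen by hand.
toy_heads = {
    "word": ([2.0, -2.0], -1.0),
    "line": ([1.0, 1.0], -0.5),
    "paragraph": ([0.0, 0.0], 1.0),
}

masks = detect_hierarchy(toy_features, toy_heads)
for level, mask in masks.items():
    print(level, mask)
```

In this sketch the features are computed once and reused by every head, which mirrors the description's point that a single model serves all hierarchy levels; the recognition decoder is omitted.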