Method: Ensemble of three task-specific Clova DEER

Date: 2023-04-02

Authors: Sukmin Seo, Song Kayeon, Taeho Kil, Donghyun Kim

Affiliation: Naver Cloud

Description: Our model is an ensemble of three specialized models, one each for the word, line, and paragraph levels. We used a pretrain-finetune training strategy. We pretrained the unified DEER (transformer) model for 500k steps on synthetic data. In the finetuning phase, we used only the HierText dataset. Each of the three specialized models (word, line, and paragraph) was finetuned for 70k steps.
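
The description does not say how the outputs of the three models are combined. As a rough illustration only, the sketch below assumes the simplest arrangement: predictions for each level are taken from the model finetuned for that level. The class name `LevelEnsemble`, the `predict` method, and the model callables are hypothetical and are not part of the DEER code; only the step counts come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Values stated in the description.
LEVELS = ("word", "line", "paragraph")
PRETRAIN_STEPS = 500_000           # unified DEER model, synthetic data
FINETUNE_STEPS_PER_LEVEL = 70_000  # each specialized model, HierText only


@dataclass
class LevelEnsemble:
    """Hypothetical wrapper: routes each level to its specialized model."""

    # Maps a level name to a callable that takes an image and returns
    # a list of detections for that level (assumed interface).
    models: Dict[str, Callable[[object], List[dict]]]

    def predict(self, image: object) -> Dict[str, List[dict]]:
        missing = [lvl for lvl in LEVELS if lvl not in self.models]
        if missing:
            raise ValueError(f"no model registered for levels: {missing}")
        # Assumption: no cross-level fusion, each level is predicted independently.
        return {lvl: self.models[lvl](image) for lvl in LEVELS}


def steps_per_specialized_model() -> int:
    """Pretraining plus finetuning steps behind one specialized model."""
    return PRETRAIN_STEPS + FINETUNE_STEPS_PER_LEVEL


def total_finetune_steps() -> int:
    """Finetuning steps summed over the three specialized models."""
    return FINETUNE_STEPS_PER_LEVEL * len(LEVELS)
```

With stub callables in place of real models, `LevelEnsemble(models).predict(image)` returns a dictionary keyed by `word`, `line`, and `paragraph`. Any real ensemble may well fuse or cross-check levels, which this sketch does not attempt.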