- Task 1 - Key Information Localization and Extraction - Method: Baseline+Ensemble+Pseudo+Post-Processing
Method: Baseline+Ensemble+Pseudo+Post-Processing
Date: 2023-05-16
Authors: UIT@AICLUB_TAB
Affiliation: UIT - University of Information Technology - VNUHCM
Email: 22520121@gm.uit.edu.vn
Description: Our approach builds on the baseline checkpoint with several improvements. We trained or reused three models:
1. RoBERTa base, trained from scratch with FGM and the Lion optimizer on synthetic data for 30 epochs, then trained on the annotated data.
2. RoBERTa ours (checkpoint), with the Lion optimizer.
3. RoBERTa base (checkpoint).
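For readers unfamiliar with the Lion optimizer mentioned in the list above, its update rule can be sketched in plain Python. This is a minimal illustration on scalar parameters, not the submission's training code: the function name `lion_step` and the hyperparameter values (`lr`, `beta1`, `beta2`, `weight_decay`) are assumptions chosen for the example, and a real run would use a tensor-based implementation.

```python
def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """Apply one Lion update to a list of scalar parameters.

    The hyperparameter defaults here are illustrative, not the values
    used by the submission.
    """
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Lion only uses the sign of an interpolation between the
        # momentum and the current gradient.
        update = sign(beta1 * m + (1 - beta1) * g)
        # Decoupled weight decay is added to the signed update.
        new_params.append(p - lr * (update + weight_decay * p))
        # The momentum is refreshed with a second interpolation factor.
        new_momentum.append(beta2 * m + (1 - beta2) * g)
    return new_params, new_momentum


params, momentum = lion_step([1.0, -2.0], [0.5, -0.3], [0.0, 0.0], lr=0.1)
```

Because every parameter moves by the same magnitude `lr` per step, Lion is typically run with a smaller learning rate than Adam-style optimizers.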
We then ensemble the three models by taking the union of the words that any model marks as one of the 55 field types, followed by post-processing.
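The union step can be sketched as follows. This is a hedged illustration, not the submission's code: it assumes each model's output is a dictionary from a word identifier to a predicted label, that `"O"` marks words outside every field type, and that conflicts between models are resolved in favor of the model listed first. The description does not state how conflicts are handled, and the field-type names and the function `union_ensemble` are hypothetical.

```python
def union_ensemble(predictions, background="O"):
    """Merge per-word predictions from several models by union.

    A word is kept if at least one model assigns it a field type.
    When models disagree on the field type, the earliest model in
    `predictions` wins (an assumption made for this sketch).
    """
    merged = {}
    for model_pred in predictions:
        for word_id, label in model_pred.items():
            if label != background and word_id not in merged:
                merged[word_id] = label
    return merged


model_outputs = [
    {"w1": "amount_due", "w2": "O"},
    {"w2": "date_issue", "w3": "O"},
    {"w1": "vendor_name", "w3": "O"},
]
ensembled = union_ensemble(model_outputs)
```

Here `w1` and `w2` are kept because at least one model labels them, while `w3` is dropped because every model that sees it predicts background. A union tends to raise recall at the cost of precision, which is presumably why a post-processing step follows it.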
Next, we use the ensemble to predict labels for the unlabeled data. The resulting pseudo-labeled data is used to pre-train the three models, which are then trained on the annotated data.
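The pseudo-labeling loop described above can be outlined like this. It is a schematic sketch under stated assumptions: `ToyModel`, `pseudo_label`, and `two_stage_training` are hypothetical names, the toy model only records which datasets it saw, and a stand-in function replaces the real ensemble.

```python
class ToyModel:
    """Placeholder for a token classifier; it only logs training stages."""

    def __init__(self, name):
        self.name = name
        self.history = []

    def train(self, dataset, stage):
        # A real model would run gradient updates here.
        self.history.append((stage, len(dataset)))


def pseudo_label(unlabeled_docs, ensemble_predict):
    """Pair every unlabeled document with the ensemble's prediction."""
    return [(doc, ensemble_predict(doc)) for doc in unlabeled_docs]


def two_stage_training(models, pseudo_data, annotated_data):
    """Pre-train each model on pseudo labels, then train on annotations."""
    for model in models:
        model.train(pseudo_data, stage="pretrain_on_pseudo")
        model.train(annotated_data, stage="train_on_annotated")
    return models


# Stand-in for the ensemble: it returns a fixed prediction per document.
def fake_ensemble(doc):
    return {"w0": "amount_due"}


pseudo_data = pseudo_label(["doc_a", "doc_b", "doc_c"], fake_ensemble)
annotated_data = [("doc_x", {"w0": "date_issue"})]
trained = two_stage_training([ToyModel("m1"), ToyModel("m2")],
                             pseudo_data, annotated_data)
```

The ordering matters: the noisier pseudo labels are used first, so the final updates come from the human-annotated data.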
Pipeline: https://ibb.co/4MWcXgb