Description: using the open source code from
Authors: Wuheng Xu, Changxu Cheng, Bohan Li
Description: Extracts area-block feature information from images at multiple scales. The model has 4 scales and 8 branches. We used three training sets (mlt17, mlt19, mlt19val).
Description: Recognition model: Transformer-based, with a ResNet50 backbone. A voting process identifies the language of the recognized transcript. Training sets: 2017 MLT task 2 training set, 2019 MLT task 2 training set, and the 2019 MLT synthetic dataset.