Authors: Yangkun Lin, Tao Xu
Affiliation: Ant Group
Description: Our detector is based on Cascade Mask R-CNN with a ConvNeXt-B backbone. We pretrain on SynthText800k and VISD10k, then fine-tune on ArT, ICDAR2019-MLT, and part of LSVT with multi-scale training. The final result is produced with multi-scale testing.
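
The description does not say how predictions from different test scales are combined, so the following is only a minimal sketch of one common approach: run the detector at several scales, map the detections back to original image coordinates, and merge them with non-maximum suppression (NMS). The callback `detect_at_scale`, the scale values, and the IoU threshold are hypothetical and not taken from the submission. Axis-aligned boxes stand in for the instance masks that Cascade Mask R-CNN would actually produce.

```python
from typing import Callable, List, Sequence, Tuple

# (x1, y1, x2, y2, score); a simplification, the real detector outputs masks.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], iou_thr: float = 0.5) -> List[Box]:
    """Greedy NMS: keep the highest-scoring box, drop overlapping ones."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_thr for k in kept):
            kept.append(box)
    return kept


def multi_scale_test(
    detect_at_scale: Callable[[float], Sequence[Box]],  # hypothetical detector hook
    scales: Sequence[float],                            # assumed scale factors
    iou_thr: float = 0.5,                               # assumed merge threshold
) -> List[Box]:
    """Merge detections from several test scales in original coordinates."""
    merged: List[Box] = []
    for s in scales:
        # The detector is assumed to return boxes in the resized image's frame.
        for x1, y1, x2, y2, score in detect_at_scale(s):
            merged.append((x1 / s, y1 / s, x2 / s, y2 / s, score))
    return nms(merged, iou_thr)
```

Here `detect_at_scale(s)` is assumed to resize the input by factor `s` and run the detector on it. Greedy NMS is only one possible merge rule; the submission may have used a different one.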