method: transformer_v1 (2020-10-15)

Authors: weihang.hwh

Affiliation: dtwave technology

Description: We simply reuse the same model we used in the ReCTS 2019 competition. The model is based on a transformer architecture with 2D attention. We use beam search when decoding images. Our data augmentation tricks include color jitter, random rotation, random scaling, motion blur, random perspective transformation, and so on. The training datasets include ArT, ReCTS, RCTW, LSVT, MLT, CTW, COCO_TEXT, CASIA-10k and MTWI. No private dataset and no ensemble trick is used.
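The beam search decoding mentioned above can be illustrated with a minimal sketch. This is not the submitted system's code; the `step_fn` interface, beam width, and toy probabilities below are all assumptions chosen for illustration. At each step, every prefix on the beam is extended by every candidate token, and only the `beam_width` prefixes with the highest cumulative log-probability are kept.

```python
import math

def beam_search(step_fn, start_token, eos_token, beam_width=3, max_len=10):
    """Minimal beam search decoder (illustrative sketch, not the authors' code).

    step_fn(prefix) -> dict mapping each candidate next token to its probability.
    Keeps the beam_width highest log-probability prefixes at each step; prefixes
    that emit eos_token are moved to the finished pool.
    """
    beams = [([start_token], 0.0)]  # (token sequence, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, p in step_fn(seq).items():
                candidates.append((seq + [tok], score + math.log(p)))
        # Keep the best beam_width candidates; set finished ones aside.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            (finished if seq[-1] == eos_token else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished beams if nothing ended
    return max(finished, key=lambda c: c[1])[0]

# Toy "character model": hypothetical probabilities, just to exercise the decoder.
probs = {
    ("<s>",): {"a": 0.6, "b": 0.4},
    ("<s>", "a"): {"b": 0.3, "</s>": 0.7},
    ("<s>", "b"): {"a": 0.9, "</s>": 0.1},
    ("<s>", "b", "a"): {"</s>": 1.0},
}

def step(seq):
    return probs[tuple(seq)]

print(beam_search(step, "<s>", "</s>", beam_width=2))  # → ['<s>', 'a', '</s>']
```

In a 2D-attention recognizer, `step_fn` would instead run the transformer decoder conditioned on the image features and return the softmax over the character vocabulary; greedy decoding is the special case `beam_width=1`.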