method: CBL_OCR (2022-01-14)

Authors: Guokun Wang(王国坤), Jingyi Shen(沈静逸), Yue Wu(吴岳), Chang Zhou(周昌), Jianqiang Huang(黄建强)

Affiliation: Alibaba

Description: The training method is based on the Transformer, which is used in both the encoder and the decoder, and multiple losses are combined for better accuracy. Our training data consists of several public datasets, including CTW, LSVT, RCTW, ReCTS, and the Baidu Scene Text Recognition contest data. We first train the model on the whole dataset, then fine-tune it on ReCTS for several epochs.
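
The two ideas in the description (combining several losses, and training on all data before fine-tuning on ReCTS) can be illustrated with a minimal sketch. This is not the authors' implementation: the description does not say which losses are used, how they are weighted, or how many epochs each stage runs, so the loss names (`ctc`, `attention_ce`), the weights, and the epoch counts below are all hypothetical placeholders. Only the dataset names come from the description.

```python
from typing import Dict, List

# Dataset names as listed in the description.
ALL_DATASETS: List[str] = [
    "CTW",
    "LSVT",
    "RCTW",
    "ReCTS",
    "Baidu Scene Text Recognition contest data",
]


def combine_losses(losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of named loss terms (missing weights default to 1.0)."""
    return sum(weights.get(name, 1.0) * value for name, value in losses.items())


def build_schedule(pretrain_epochs: int, finetune_epochs: int) -> List[dict]:
    """Two-stage plan: train on every dataset, then fine-tune on ReCTS only."""
    return [
        {"stage": "pretrain", "datasets": list(ALL_DATASETS), "epochs": pretrain_epochs},
        {"stage": "finetune", "datasets": ["ReCTS"], "epochs": finetune_epochs},
    ]


# Hypothetical per-step loss values and weights, for illustration only.
step_losses = {"ctc": 2.0, "attention_ce": 1.0}
step_weights = {"ctc": 0.5, "attention_ce": 1.0}
total_loss = combine_losses(step_losses, step_weights)

# Hypothetical epoch counts; the description only says "several epochs".
schedule = build_schedule(pretrain_epochs=10, finetune_epochs=3)

for stage in schedule:
    print(stage["stage"], stage["epochs"], ", ".join(stage["datasets"]))
```

In a real training loop the scalar values in `step_losses` would be replaced by the loss tensors produced by the model at each step, and the combined value would be the one that is backpropagated.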