Authors: Tianwei Wang*, Jiaxin Zhang*, Yichao Huang*, Jiapeng Wang, Yan Li, Canjie Luo, Kai Ding, Lianwen Jin (*equal contribution)
Description: We use a CRNN-based model with a ResNet-50 backbone to predict text strings. In addition to the official training dataset, we synthesize 2 million samples for training. Finally, a sentence-level lexicon extracted from the training dataset is used to constrain the predictions.
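
The description does not say how the lexicon restricts the predictions. One common approach, sketched below purely as an assumption, is to replace each predicted string with its nearest lexicon entry under Levenshtein edit distance, keeping the raw prediction when no entry is close enough. The names `edit_distance`, `restrict_to_lexicon`, and the `max_ratio` threshold are hypothetical and are not taken from the submission.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def restrict_to_lexicon(prediction: str,
                        lexicon: Iterable[str],
                        max_ratio: float = 0.5) -> str:
    """Snap a prediction to its nearest lexicon sentence.

    Assumption: the raw prediction is kept when the best match differs by
    more than max_ratio of the longer string's length. The threshold is
    illustrative, not a value reported by the authors.
    """
    best, best_dist = None, None
    for entry in lexicon:
        dist = edit_distance(prediction, entry)
        if best_dist is None or dist < best_dist:
            best, best_dist = entry, dist
    if best is None:
        return prediction  # empty lexicon: nothing to snap to
    if best_dist > max_ratio * max(len(prediction), len(best)):
        return prediction  # no lexicon entry is close enough
    return best
```

For example, `restrict_to_lexicon("helo world", ["hello world", "good morning"])` snaps the prediction to `"hello world"`, while a prediction that is far from every entry is returned unchanged.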