method: yyds (2022-07-21)

Authors: yuanyeyyds

Affiliation: yyds

Description: Model: For the text detector, we used DBNet++. For the text recognizer, we used ViT as the backbone, and the model has two output heads: one uses the CTC mechanism and the other uses the attention mechanism. The ensemble of these two outputs is used as the final result.
Data: Our text detector used only the official training data. For text recognizer training, we used the official data plus 10M additional synthetic samples.
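
The description does not say how the outputs of the two heads are combined. As an illustration only, the sketch below assumes one common choice: decode each head greedily, score each hypothesis by its mean log-probability, and keep the higher-scoring one. The function names, the shared blank/EOS index 0, and the toy charset are all hypothetical and are not taken from the submission.

```python
import math


def ctc_greedy_decode(frame_probs, charset, blank=0):
    """Greedy CTC decoding: take the argmax per frame, collapse repeats, drop blanks.

    Returns (text, mean log-probability of the chosen frame labels).
    """
    text = []
    log_conf = 0.0
    prev = blank
    for probs in frame_probs:
        idx = max(range(len(probs)), key=probs.__getitem__)
        log_conf += math.log(probs[idx])
        if idx != blank and idx != prev:
            text.append(charset[idx])
        prev = idx
    return "".join(text), log_conf / max(len(frame_probs), 1)


def attention_greedy_decode(step_probs, charset, eos=0):
    """Greedy attention decoding: take the argmax per step and stop at EOS.

    Returns (text, mean log-probability of the chosen step labels, EOS included).
    """
    text = []
    log_conf = 0.0
    steps = 0
    for probs in step_probs:
        idx = max(range(len(probs)), key=probs.__getitem__)
        log_conf += math.log(probs[idx])
        steps += 1
        if idx == eos:
            break
        text.append(charset[idx])
    return "".join(text), log_conf / max(steps, 1)


def ensemble(ctc_result, attn_result):
    """Assumed ensembling rule: keep the hypothesis with the higher score.

    Each argument is a (text, score) pair; ties go to the CTC hypothesis.
    """
    return ctc_result if ctc_result[1] >= attn_result[1] else attn_result


if __name__ == "__main__":
    charset = "-ab"  # toy charset; index 0 stands for CTC blank / attention EOS
    ctc_frames = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.9, 0.05, 0.05], [0.1, 0.1, 0.8]]
    attn_steps = [[0.05, 0.9, 0.05], [0.05, 0.05, 0.9], [0.9, 0.05, 0.05]]
    best = ensemble(
        ctc_greedy_decode(ctc_frames, charset),
        attention_greedy_decode(attn_steps, charset),
    )
    print(best[0])
```

Other combination rules, such as rescoring one head's hypothesis with the other head or averaging aligned character probabilities, are equally plausible; confidence-based selection is shown only because it needs no alignment between the two heads.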