method: oCLIP_v2

Date: 2022-07-20

Authors: Zhangzi Zhu, Yu Hao, Wenqing Zhang, Chuhui Xue, Song Bai

Description: For detection, we first pre-train a Deformable ResNet-101 with oCLIP on the provided training set. We then use the pre-trained model to train TESTR, PAN, and Mask TextSpotter with different backbones. Finally, we combine the results across methods, backbones, and input scales. For recognition, we adopt SCATTER.
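
The description does not say how the detection results are combined. As an illustration only, one common way to merge outputs from several detectors, backbones, and scales is to pool all boxes and apply score-ordered non-maximum suppression. The sketch below assumes axis-aligned boxes given as (x1, y1, x2, y2, score); the names `iou` and `combine_detections` and the 0.5 threshold are hypothetical choices, not taken from the submission, which may well use polygons and a different merging rule.

```python
from typing import List, Tuple

# Hypothetical box format (an assumption): (x1, y1, x2, y2, score)
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def combine_detections(runs: List[List[Box]], iou_thr: float = 0.5) -> List[Box]:
    """Pool boxes from every run, then keep them greedily by descending score.

    A box is dropped when it overlaps an already kept box with IoU >= iou_thr.
    Each element of `runs` holds the boxes from one method/backbone/scale.
    """
    pooled = sorted(
        (box for run in runs for box in run),
        key=lambda box: box[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in pooled:
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


if __name__ == "__main__":
    run_a = [(0.0, 0.0, 10.0, 10.0, 0.9)]
    run_b = [(1.0, 1.0, 11.0, 11.0, 0.8), (50.0, 50.0, 60.0, 60.0, 0.7)]
    for box in combine_detections([run_a, run_b]):
        print(box)
```

Greedy suppression keeps only the highest-scoring box in each overlapping group. Alternatives that average or vote over overlapping boxes would fit the same interface.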