Method: oCLIP

Date: 2022-07-20

Authors: Zhangzi Zhu, Yu Hao, Wenqing Zhang, Chuhui Xue, Song Bai

Affiliation: ByteDance

Description: For detection, we first pre-train the Deformable ResNet-101 and VAN-Large backbones with oCLIP on the training set. We then train TESTR, PAN, and Mask TextSpotter, each with different backbones initialized from the pre-trained models. Finally, we combine the results from the different methods, backbones, and input scales.
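
The description does not say how the results are combined. As a rough illustration only, the sketch below pools detections from several runs (one run per method, backbone, or scale) and applies greedy non-maximum suppression. The axis-aligned box format, the `merge_detections` helper, and the 0.5 IoU threshold are all assumptions made for this example, not the submission's actual procedure; real scene-text detectors usually output polygons, which would need a polygon IoU instead.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2, score).
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(runs: List[List[Box]], iou_thr: float = 0.5) -> List[Box]:
    """Pool boxes from all runs and keep the highest-scoring non-overlapping ones.

    Assumed combination rule (greedy NMS); the source does not specify one.
    """
    pooled = sorted(
        (box for run in runs for box in run),
        key=lambda b: b[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in pooled:
        # Keep a box only if it does not overlap strongly with a better one.
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept
```

In this sketch, boxes predicted at different scales are assumed to have already been mapped back to the original image coordinates before pooling.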