Method: oCLIP
Date: 2022-07-20
Authors: Zhangzi Zhu, Yu Hao, Wenqing Zhang, Chuhui Xue, Song Bai
Affiliation: ByteDance
Description: For detection, we first pre-train Deformable ResNet-101 and VAN-Large backbones with oCLIP on the training set. We then train TESTR, PAN, and Mask TextSpotter, each with different backbones initialized from the pre-trained models. Finally, we combine the results from the different methods, backbones, and input scales.
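
The description does not say how the results are combined. One common way to merge detections from several models and scales is to pool all boxes and apply greedy non-maximum suppression; the following is a minimal sketch of that idea, not the submission's actual procedure. It assumes axis-aligned boxes with confidence scores already mapped back to original image coordinates, and the names `merge_detections` and `iou_thr` are hypothetical.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2, score), axis-aligned.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(per_model: List[List[Box]], iou_thr: float = 0.5) -> List[Box]:
    """Pool boxes from all models/scales and keep the highest-scoring
    box in each group of overlapping detections (greedy NMS)."""
    pooled = sorted(
        (box for boxes in per_model for box in boxes),
        key=lambda box: box[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in pooled:
        # Keep a box only if it does not overlap a higher-scoring kept box.
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


# Hypothetical outputs from two detectors on the same image.
model_a = [(0.0, 0.0, 10.0, 10.0, 0.9)]
model_b = [(1.0, 1.0, 11.0, 11.0, 0.8), (50.0, 50.0, 60.0, 60.0, 0.7)]
merged = merge_detections([model_a, model_b])
```

Here the two overlapping boxes collapse to the higher-scoring one, while the isolated box from the second detector is retained. Scene-text detectors often output polygons rather than axis-aligned boxes, in which case the overlap test would need a polygon IoU instead.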