method: TH-DL-v1 2019-06-03

Authors: Shanyu Xiao, Ruijie Yan, Meng Liu, Bowen Deng, Gang Yao, Liangrui Peng, Tsinghua University, Beijing, China

Description: We propose a one-stage detector for multiscale text regions that incorporates a modified feature pyramid network (FPN) into the EAST [1] framework. For image preprocessing, a pixel-level deformable module with a linear spatial constraint is added before the FPN. In addition to the classification and RBOX branches of EAST, a center-ness regression branch [2] is adopted to suppress low-quality detected text boxes. During training, all loss maps are multiplied by the ground-truth center-ness map to reduce the loss weights near the text box borders.
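The center-ness weighting described above can be illustrated with a minimal sketch. It assumes the center-ness definition from FCOS [2], sqrt(min(l, r)/max(l, r) * min(t, b)/max(t, b)), where l, t, r, b are a pixel's distances to the four box sides, and it uses an axis-aligned box for simplicity, whereas the method itself regresses rotated boxes (RBOX). The function names `centerness`, `centerness_map`, and `weight_loss_map` are hypothetical and not taken from the authors' implementation.

```python
import math


def centerness(l, t, r, b):
    """FCOS-style center-ness for one pixel, given its distances to the
    left, top, right, and bottom sides of its ground-truth box."""
    if min(l, t, r, b) <= 0:
        return 0.0  # pixel lies on the border or outside the box
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


def centerness_map(height, width, box):
    """Ground-truth center-ness map for one axis-aligned box.

    box = (x0, y0, x1, y1) in pixel coordinates (an assumption of this
    sketch; rotated boxes would need distances to rotated edges).
    Returns a height x width nested list indexed as cmap[y][x].
    """
    x0, y0, x1, y1 = box
    return [
        [centerness(x - x0, y - y0, x1 - x, y1 - y) for x in range(width)]
        for y in range(height)
    ]


def weight_loss_map(loss_map, cmap):
    """Multiply a per-pixel loss map element-wise by the center-ness map,
    so pixels near the box border contribute less to the total loss."""
    return [
        [loss * c for loss, c in zip(loss_row, c_row)]
        for loss_row, c_row in zip(loss_map, cmap)
    ]
```

For a 5x5 map with a box spanning (0, 0) to (4, 4), the weight is 1 at the box center, falls off toward the sides, and is 0 on the border, so a uniform loss map is scaled down most strongly near the box edges.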

[1] Zhou X, Yao C, Wen H, et al. EAST: An Efficient and Accurate Scene Text Detector. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 5551-5560.
[2] Tian Z, Shen C, Chen H, et al. FCOS: Fully Convolutional One-Stage Object Detection. arXiv preprint arXiv:1904.01355, 2019.