method: SRCB_LSVT (2019-04-29)

Authors: Yi Yu, Haiyang Guo, Xiaobing Wang, Yingying Jiang

Description: We use Mask R-CNN for text detection in our submission. Mask R-CNN extends Faster R-CNN by adding a branch that predicts an object mask in parallel with the existing branch for bounding box recognition. It is simple to train, adds only a small overhead to Faster R-CNN, and is a state-of-the-art object detection framework, so we apply it to the text detection task. We use the Mask R-CNN implementation from https://github.com/facebookresearch/Detectron with a ResNeXt-101 backbone. Text is the only object category, so the number of classification classes is 2 (text and background). Polygon-based NMS is applied as post-processing to remove overlapping text regions.
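
The polygon-based NMS step can be illustrated with a minimal sketch. The submission does not specify how the overlap is computed or which threshold is used, so everything below is an assumption for illustration: the function names (`polygon_iou`, `polygon_nms`) are hypothetical, the greedy score-ordered suppression is the standard NMS scheme rather than the authors' confirmed procedure, the 0.3 threshold is arbitrary, and the Sutherland-Hodgman clipping used for the intersection is only exact when the polygons are convex.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def signed_area(poly: Sequence[Point]) -> float:
    """Shoelace formula; positive for counter-clockwise vertex order."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def clip_convex(subject: Sequence[Point], clip: Sequence[Point]) -> List[Point]:
    """Sutherland-Hodgman clipping of `subject` against a convex `clip` polygon."""
    # The half-plane test below assumes counter-clockwise clip vertices.
    if signed_area(clip) < 0:
        clip = list(reversed(clip))
    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break
        ax, ay = clip[i]
        bx, by = clip[(i + 1) % len(clip)]
        points, output = output, []
        for j in range(len(points)):
            p, q = points[j], points[(j + 1) % len(points)]
            # Cross products: >= 0 means the point is on the inner side of edge a->b.
            sp = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            sq = (bx - ax) * (q[1] - ay) - (by - ay) * (q[0] - ax)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # position of the crossing along p->q
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def polygon_iou(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Intersection over union of two convex polygons (assumption: convex)."""
    inter = abs(signed_area(clip_convex(a, b)))
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0


def polygon_nms(polys: Sequence[Sequence[Point]], scores: Sequence[float],
                iou_threshold: float = 0.3) -> List[int]:
    """Greedy NMS: keep the highest-scoring polygons, drop heavy overlaps."""
    order = sorted(range(len(polys)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(polygon_iou(polys[i], polys[k]) < iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping text regions and one separate region.
region_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
region_b = [(1, 0), (3, 0), (3, 2), (1, 2)]
region_c = [(10, 10), (12, 10), (12, 12), (10, 12)]
kept = polygon_nms([region_a, region_b, region_c], [0.9, 0.8, 0.7])
print(kept)
```

In this sketch the lower-scoring `region_b` overlaps `region_a` with an IoU of 1/3, which exceeds the assumed 0.3 threshold, so it is suppressed while the distant `region_c` is kept. For non-convex text polygons, such as those derived from curved-text masks, a general polygon intersection routine would be needed in place of the convex clipping.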