method: TextFuseNet 2020-07-31

Authors: Jian Ye, Zhe Chen, Juhua Liu and Bo Du

Affiliation: Wuhan University, The University of Sydney

Email: liujuhua@whu.edu.cn

Description: Arbitrary shape text detection in natural scenes is an extremely challenging task. Unlike existing text detection approaches that only perceive texts based on limited feature representations, we propose a novel framework, namely TextFuseNet, that exploits richer fused features for text detection. More specifically, we propose to perceive texts from three levels of feature representations, i.e., character-, word- and global-level, and then introduce a novel text representation fusion technique to help achieve robust arbitrary text detection. The multi-level feature representation can adequately describe texts by dissecting them into individual characters while still maintaining their general semantics. TextFuseNet then collects and merges the texts’ features from different levels using a multi-path fusion architecture which can effectively align and fuse different representations. In practice, our proposed TextFuseNet can learn a more adequate description of arbitrarily shaped texts, suppressing false positives and producing more accurate detection results. Our proposed framework can also be trained with weak supervision for those datasets that lack character-level annotations. Experiments on several datasets show that the proposed TextFuseNet achieves state-of-the-art performance. Specifically, we achieve an F-measure of 94.3% on ICDAR2013, 92.1% on ICDAR2015, 87.1% on Total-Text and 86.6% on CTW-1500.

method: TH 2020-01-22

Authors: Tsinghua University and Hyundai Motor Group AIRS Company

Email: Shanyu Xiao: xiaosy19@mails.tsinghua.edu.cn

Description: We have built an end-to-end scene text spotter based on Mask R-CNN and a Transformer. It uses a ResNet-50 backbone and multi-scale training and testing.

method: JDAI 2019-08-13

Authors: Jingyang Lin, Jiajia Geng, Rongfeng Lai

Description: We are from JDAI and Sun Yat-Sen University. Our method is a strong scene text detection baseline built on the Mask R-CNN architecture.

Ranking Table

Date | Method | Recall | Precision | Hmean
2020-07-31 | TextFuseNet | 90.56% | 93.96% | 92.23%
2020-01-22 | TH | 89.46% | 94.03% | 91.69%
2019-08-13 | JDAI | 90.85% | 92.50% | 91.67%
2019-03-11 | Alibaba-PAI V2 | 89.41% | 93.32% | 91.32%
2019-08-08 | Eleme-AI | 89.31% | 93.03% | 91.13%
2018-09-07 | Sogou_MM | 90.03% | 92.21% | 91.11%
2018-07-03 | Baidu VIS v2 | 88.11% | 94.04% | 90.98%
2019-05-14 | ArtDet | 88.40% | 92.91% | 90.60%
2018-01-31 | Alibaba-PAI | 87.34% | 93.84% | 90.47%
2018-01-22 | FOTS | 87.92% | 91.85% | 89.84%
2018-11-15 | Pixel-Anchor(Multiscale) | 86.95% | 92.28% | 89.54%
2018-03-05 | HoText_v1 | 83.58% | 96.34% | 89.51%
2019-08-07 | CM-CV&AR | 87.53% | 91.49% | 89.47%
2020-08-12 | RRPN++ (single scale) | 87.19% | 91.84% | 89.45%
2019-04-11 | Shape-Aware Arbitrary Text Detector | 87.77% | 91.20% | 89.45%
2020-09-26 | DCLNet | 87.05% | 90.31% | 88.65%
2017-09-15 | Baidu VIS | 83.39% | 93.62% | 88.21%
2017-12-19 | HIK_OCR_V1 | 84.59% | 91.94% | 88.11%
2019-05-14 | ArtDet (single scale) | 84.93% | 91.40% | 88.05%
2018-06-26 | SPCNet_TongJi & UESTC (single scale) | 86.71% | 88.94% | 87.81%
2017-09-12 | SenseTime V3 | 84.83% | 90.82% | 87.73%
2019-05-17 | SEG-PIXEL-PAN (single-scale) | 85.32% | 90.22% | 87.70%
2018-11-15 | Pixel-Anchor(single scale) | 87.05% | 88.32% | 87.68%
2018-05-18 | PSENet_NJU_ImagineLab (single-scale) | 85.22% | 89.30% | 87.21%
2017-09-11 | Dahua-OCR V3 | 83.44% | 91.31% | 87.19%
2018-12-03 | CV_OCR_NOOB(single-scale) | 82.96% | 91.55% | 87.04%
2019-04-08 | CRAFT | 84.26% | 89.79% | 86.93%
2018-01-16 | HappyCCL | 83.87% | 89.10% | 86.41%
2017-12-25 | Alibaba-PAI | 82.91% | 89.55% | 86.10%
2017-07-12 | Tencent-DPPR | 81.80% | 90.71% | 86.03%
2019-05-30 | Lanm | 84.26% | 87.28% | 85.74%
2017-07-04 | Baidu IDL v3 | 81.99% | 89.82% | 85.73%
2018-07-09 | BUPT ICYBEE | 81.99% | 89.73% | 85.69%
2017-05-13 | SRC-B-MachineLearningLab-v3 | 82.81% | 88.66% | 85.64%
2020-04-18 | MMLab-PolarMask++(Single Scale) | 83.53% | 87.36% | 85.40%
2017-09-13 | PixelLink | 83.77% | 86.65% | 85.19%
2018-12-02 | EPTN-SJTU | 80.93% | 89.13% | 84.83%
2017-07-26 | TextBoxes++ | 80.79% | 89.11% | 84.75%
2018-01-04 | crpn | 80.69% | 88.77% | 84.54%
2017-11-26 | blank_net_v1 | 80.79% | 88.46% | 84.45%
2017-10-22 | FTDN-SJTU-v2 | 80.93% | 87.69% | 84.18%
2017-09-03 | CCFLAB_FTSN | 80.07% | 88.65% | 84.14%
2017-09-07 | cvte_zju_v2 | 84.40% | 83.64% | 84.02%
2018-05-07 | textDetection | 84.83% | 83.19% | 84.00%
2017-02-17 | NLPR-CASIA | 82.76% | 84.76% | 83.75%
2017-06-11 | SCUT_DMPNet_pro | 83.73% | 83.53% | 83.63%
2018-08-09 | YY-tl_final | 82.43% | 84.63% | 83.51%
2017-09-04 | FTDN-SJTU | 80.55% | 86.59% | 83.46%
2019-07-15 | stela | 78.57% | 88.70% | 83.33%
2019-04-10 | EAST-VGG16 | 81.27% | 84.36% | 82.79%
2018-01-10 | HoText_v0 | 79.20% | 86.40% | 82.64%
2020-08-14 | DAL(multi-scale) | 80.45% | 84.35% | 82.36%
2018-04-28 | train twice | 79.20% | 85.77% | 82.35%
2018-01-08 | q | 80.74% | 83.81% | 82.25%
2017-08-18 | Dahua-OCR v1 | 79.30% | 85.16% | 82.12%
2018-04-26 | east-modified | 78.57% | 85.40% | 81.85%
2017-02-22 | SRC-B-MachineLearningLab-v2 | 79.78% | 83.64% | 81.67%
2019-08-02 | PyTorch re-implementation of EAST | 74.48% | 90.26% | 81.61%
2020-07-12 | east improved | 76.50% | 87.40% | 81.59%
2020-08-13 | DAL | 79.49% | 83.68% | 81.53%
2017-08-14 | MultDet | 78.48% | 84.32% | 81.30%
2017-01-21 | CNN based model | 79.68% | 82.34% | 80.99%
2017-07-31 | EAST reimplemention with resnet 50 | 77.32% | 84.66% | 80.83%
2017-12-11 | Huaibeicun TextDetector | 77.32% | 84.66% | 80.83%
2017-06-19 | Megvii-EAST | 78.33% | 83.27% | 80.72%
2017-10-18 | ICT | 76.55% | 85.21% | 80.65%
2017-07-31 | dengdan_zju_cad | 80.36% | 80.43% | 80.39%
2017-08-11 | SegRPN | 75.83% | 85.18% | 80.23%
2017-07-12 | CCFLAB_v1 | 77.03% | 83.68% | 80.22%
2017-01-23 | RRPN-4 | 77.13% | 83.52% | 80.20%
2019-03-08 | R2CNN++ (single scale) | 78.86% | 81.33% | 80.08%
2017-11-21 | ITN_VGG | 74.15% | 85.70% | 79.50%
2017-01-13 | MSRA_v1 | 74.10% | 85.22% | 79.27%
2019-07-11 | AFCTPN | 76.02% | 82.71% | 79.23%
2018-05-07 | vm24 | 74.92% | 82.63% | 78.59%
2018-04-27 | my_east_with_atrous | 75.25% | 81.96% | 78.46%
2020-09-28 | Huawei_GDE_AI | 77.85% | 77.26% | 77.55%
2020-08-15 | R-RetinaNet | 77.80% | 77.25% | 77.52%
2016-10-28 | RRPN-3 | 73.23% | 82.17% | 77.44%
2017-07-13 | knet0 | 72.46% | 83.06% | 77.40%
2020-05-24 | East reimplementation with vgg16_bn | 71.21% | 84.32% | 77.21%
2017-01-19 | SRC-B-MachineLearningLab | 69.86% | 86.11% | 77.14%
2017-01-13 | SCUT_DMP_v2 | 76.89% | 77.00% | 76.95%
2017-02-12 | SSTD | 73.86% | 80.23% | 76.91%
2017-07-13 | knet1 | 76.89% | 76.01% | 76.45%
2017-07-14 | zju_cvte_seglink_512 | 72.85% | 80.22% | 76.36%
2017-07-09 | CCFLAB | 75.06% | 77.25% | 76.14%
2018-08-06 | YY_tl_v1 | 69.23% | 83.65% | 75.76%
2017-01-14 | hust_orientedText | 76.50% | 74.74% | 75.61%
2017-10-12 | DDR_v4 | 73.57% | 77.76% | 75.61%
2016-10-25 | Baidu IDL v2 | 72.75% | 77.41% | 75.01%
2017-10-24 | BJUT_FNIC | 70.73% | 79.45% | 74.83%
2017-05-01 | TsinghuaOCR | 67.74% | 83.21% | 74.68%
2017-10-12 | DDR_v3 | 74.96% | 72.02% | 73.46%
2017-10-12 | DDR_v2 | 74.19% | 69.32% | 71.67%
2016-09-23 | HUST_OrientedText | 74.15% | 69.03% | 71.49%
2016-10-25 | SCUT_DMPNet | 68.22% | 73.23% | 70.64%
2016-06-23 | Baidu IDL | 68.22% | 71.53% | 69.84%
2018-07-27 | YY-tl | 67.21% | 67.80% | 67.50%
2015-11-11 | Megvii-Image++ | 56.96% | 72.40% | 63.76%
2017-10-12 | DDR_v1 | 72.80% | 55.98% | 63.29%
2019-07-23 | std++(single-scale) | 56.67% | 71.64% | 63.28%
2020-05-04 | RESULTS | 66.39% | 60.30% | 63.20%
2016-11-08 | CTPN | 51.56% | 74.22% | 60.85%
2018-04-27 | erTree_mxFlow_sigleDir | 55.32% | 66.07% | 60.22%
2018-12-27 | fast_ret_sh_02 | 54.07% | 65.87% | 59.39%
2017-09-21 | UCAS_CMVT3 | 49.16% | 66.38% | 56.49%
2020-09-05 | zoomtext | 47.76% | 68.65% | 56.33%
2016-01-15 | MCLAB_FCN | 43.09% | 70.81% | 53.58%
2019-01-27 | rrpn ssprpns | 50.79% | 50.55% | 50.67%
2017-09-08 | CMVT_resNet | 42.99% | 60.26% | 50.18%
2015-04-03 | Stradvision-2 | 36.74% | 77.46% | 49.84%
2019-01-27 | rrpn example | 50.02% | 49.43% | 49.72%
2018-03-29 | resNet-32r-3dp | 46.65% | 52.95% | 49.60%
2015-04-02 | StradVision-1 | 46.27% | 53.39% | 49.57%
2015-12-15 | CASIA_USTB-Cascaded | 39.53% | 61.68% | 48.18%
2015-04-02 | NJU_Text_Version4 | 35.82% | 72.73% | 48.00%
2015-04-01 | NJU Text (Version2) | 36.25% | 70.44% | 47.87%
2017-08-11 | textbox | 36.74% | 65.66% | 47.11%
2015-03-31 | AJOU | 46.94% | 47.26% | 47.10%
2015-03-30 | NJU_Text_Version1 | 38.32% | 56.33% | 45.62%
2015-03-31 | NJU_Text_Version2 | 37.46% | 54.14% | 44.28%
2017-10-12 | TextFCN V2 | 37.02% | 54.19% | 43.99%
2015-04-02 | NJU_Text_Version5 | 37.84% | 51.41% | 43.59%
2019-07-08 | psenet | 45.74% | 38.84% | 42.01%
2015-04-02 | HUST MCLAB (VER3.0) | 37.79% | 44.00% | 40.66%
2015-04-02 | HUST_MCLAB_VER1.0 | 34.81% | 47.47% | 40.17%
2015-04-02 | HUST_MCLAB_VER.0 | 34.09% | 46.49% | 39.33%
2015-04-02 | HUST_MCLAB_VER2.0 | 34.09% | 46.49% | 39.33%
2015-04-02 | Deep2Text-MO | 32.11% | 49.59% | 38.98%
2015-04-03 | CNN Proposal Based MSER | 34.42% | 34.71% | 34.57%
2015-04-03 | TD-IMU | 25.28% | 34.56% | 29.20%
2015-04-03 | TextCatcher-2 (LRDE) | 34.81% | 24.91% | 29.04%
2018-07-11 | 111 | 21.71% | 35.21% | 26.86%
2018-07-11 | 123 | 22.53% | 33.07% | 26.80%
2018-07-11 | 2015_test_result_2000_2200_0.8 | 23.54% | 28.85% | 25.93%
2018-04-25 | Full YOLO | 22.00% | 31.54% | 25.92%
2018-07-08 | 123456 | 18.05% | 20.47% | 19.19%
2018-04-25 | Tiny YOLO | 14.54% | 14.39% | 14.47%
2015-04-01 | imagine | 5.44% | 12.97% | 7.67%
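
The Hmean column in the ranking table above appears to be the harmonic mean of recall and precision (the usual F-measure). The page does not state the formula, so the sketch below treats it as an assumption; the function name `hmean` is illustrative and not part of any competition tooling. It recomputes Hmean for the top three entries from their published recall and precision.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision, both given in percent.

    Assumption: the leaderboard's Hmean is 2*R*P / (R + P).
    """
    if recall + precision == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * recall * precision / (recall + precision)


# (method, recall %, precision %, published Hmean %) copied from the table
rows = [
    ("TextFuseNet", 90.56, 93.96, 92.23),
    ("TH", 89.46, 94.03, 91.69),
    ("JDAI", 90.85, 92.50, 91.67),
]

for method, recall, precision, published in rows:
    computed = hmean(recall, precision)
    # Allow a small tolerance: the published recall and precision are
    # already rounded to two decimals, so the last digit may differ.
    matches = abs(computed - published) < 0.01
    print(f"{method}: computed {computed:.2f}, published {published:.2f}, match={matches}")
```

Because the published recall and precision are rounded, a recomputed Hmean can differ from the table in the last decimal place, which is why the comparison uses a tolerance rather than exact equality.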

Ranking Graphic