method: CLOVA OCR DEER

Date: 2022-07-20

Authors: Taeho Kil, Seonghyeon Kim, Sukmin Seo

Affiliation: Clova AI OCR Team, NAVER/LINE Corp.

Description: An end-to-end scene text spotter built from a CNN backbone, a deformable transformer encoder, a location decoder, and a text decoder. The location decoder, based on a segmentation method (Differentiable Binarization), detects text regions; the text decoder, based on a deformable transformer decoder, recognizes each instance from the image features and the detected location information. We use a single model rather than an ensemble of models, and all sub-modules are end-to-end trainable. We train on the real datasets provided by this challenge (train + val split) and on a synthetic dataset. Because the COCO-Text dataset has a lot of label noise (with regard to letter capitalization), we refined its annotations using a teacher model (trained without COCO-Text).
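The description does not say how the teacher model's output was used to refine the COCO-Text annotations. As one plausible illustration only, the sketch below assumes the refinement touches nothing but letter case: when the teacher's prediction matches the annotation ignoring case, the teacher's casing is adopted; otherwise the original annotation is kept. The function name `refine_capitalization` and the sample strings are hypothetical and are not taken from the authors' code.

```python
def refine_capitalization(label: str, teacher_pred: str) -> str:
    """Adopt the teacher's casing only when both strings agree ignoring case.

    Assumption (not from the source): refinement changes capitalization only.
    """
    if label.casefold() == teacher_pred.casefold():
        return teacher_pred
    # The strings differ by more than casing, so keep the original annotation.
    return label


# Hypothetical annotations and teacher predictions, for illustration only.
annotations = ["STOP", "coca-cola", "exit"]
teacher_predictions = ["Stop", "Coca-Cola", "exlt"]

refined = [
    refine_capitalization(label, pred)
    for label, pred in zip(annotations, teacher_predictions)
]
print(refined)
```

Keeping the original annotation on any disagreement beyond casing is a conservative choice: it stops recognition errors made by the teacher from leaking into the training labels.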

method: Detector Free E2E Method

Date: 2022-07-21

Authors: Kim Seonghyeon

Affiliation: NAVER

Description: A detection-free end-to-end text recognizer that uses a CNN followed by a deformable encoder and decoder. It was trained on the training + validation data from the RRC competitions plus additional SynthText data synthesized with the MJSynth 90k dictionary, using a longer training schedule.

Ranking Table

| Date | Method | Hmean | All Precision | All Recall | All Hmean | OOV Precision | OOV Recall | OOV Hmean | IV Precision | IV Recall | IV Hmean |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2022-07-20 | CLOVA OCR DEER | 0.4242 | 0.6716 | 0.5213 | 0.5870 | 0.1856 | 0.4876 | 0.2689 | 0.6450 | 0.5259 | 0.5794 |
| 2022-07-21 | e2e text spotter - final version | 0.4239 | 0.6717 | 0.5204 | 0.5864 | 0.1858 | 0.4872 | 0.2690 | 0.6451 | 0.5249 | 0.5788 |
| 2022-07-21 | Detector Free E2E Method | 0.4201 | 0.6615 | 0.5244 | 0.5850 | 0.1797 | 0.4935 | 0.2635 | 0.6344 | 0.5286 | 0.5767 |
| 2022-07-20 | oCLIP_v2 | 0.4133 | 0.6737 | 0.4682 | 0.5524 | 0.2028 | 0.4842 | 0.2859 | 0.6441 | 0.4660 | 0.5408 |
| 2022-07-19 | CLOVA OCR DEER | 0.4057 | 0.6399 | 0.5243 | 0.5764 | 0.1624 | 0.4800 | 0.2427 | 0.6129 | 0.5303 | 0.5686 |
| 2022-07-20 | large_param3 | 0.4032 | 0.6344 | 0.5345 | 0.5802 | 0.1544 | 0.4721 | 0.2327 | 0.6082 | 0.5429 | 0.5737 |
| 2022-07-28 | fnnrcv3 | 0.3930 | 0.7186 | 0.3919 | 0.5072 | 0.2270 | 0.3781 | 0.2837 | 0.6933 | 0.3937 | 0.5022 |
| 2022-07-20 | DB_threshold2_TRBA_CocoValid | 0.3910 | 0.6408 | 0.4993 | 0.5613 | 0.1526 | 0.4229 | 0.2243 | 0.6160 | 0.5096 | 0.5578 |
| 2022-07-28 | tbd | 0.3794 | 0.6944 | 0.3881 | 0.4979 | 0.2069 | 0.3740 | 0.2664 | 0.6679 | 0.3900 | 0.4924 |
| 2022-07-21 | zyk | 0.3560 | 0.5463 | 0.5367 | 0.5415 | 0.1114 | 0.4687 | 0.1800 | 0.5190 | 0.5459 | 0.5321 |
| 2022-08-01 | cnnrcv4 | 0.3520 | 0.6327 | 0.3856 | 0.4791 | 0.1680 | 0.3794 | 0.2329 | 0.6033 | 0.3864 | 0.4711 |
| 2022-07-20 | BIT | 0.3489 | 0.5487 | 0.5065 | 0.5268 | 0.1127 | 0.4440 | 0.1798 | 0.5213 | 0.5149 | 0.5181 |
| 2022-08-11 | Baseline - GLASS | 0.3487 | 0.7580 | 0.3063 | 0.4363 | 0.2491 | 0.2723 | 0.2602 | 0.7368 | 0.3109 | 0.4373 |
| 2022-07-21 | E2E_MASK | 0.3213 | 0.4790 | 0.5414 | 0.5083 | 0.0864 | 0.4673 | 0.1458 | 0.4520 | 0.5514 | 0.4968 |
| 2022-07-21 | yyds | 0.2868 | 0.5153 | 0.3554 | 0.4207 | 0.1063 | 0.3336 | 0.1612 | 0.4857 | 0.3583 | 0.4124 |
| 2022-07-21 | yyvis | 0.2848 | 0.5120 | 0.3531 | 0.4180 | 0.1054 | 0.3326 | 0.1600 | 0.4823 | 0.3559 | 0.4095 |
| 2022-07-21 | sudokill-9 | 0.2834 | 0.5162 | 0.3408 | 0.4106 | 0.1103 | 0.3322 | 0.1656 | 0.4854 | 0.3420 | 0.4012 |
| 2022-07-21 | rickyyds | 0.2833 | 0.5159 | 0.3406 | 0.4103 | 0.1104 | 0.3327 | 0.1657 | 0.4850 | 0.3417 | 0.4009 |
| 2022-07-21 | PAN | 0.2813 | 0.5050 | 0.3481 | 0.4121 | 0.1050 | 0.3358 | 0.1600 | 0.4745 | 0.3498 | 0.4027 |
| 2022-07-21 | transformer | 0.2813 | 0.5047 | 0.3478 | 0.4118 | 0.1052 | 0.3368 | 0.1603 | 0.4740 | 0.3493 | 0.4022 |
| 2022-07-20 | CVO detection and recognition model | 0.2657 | 0.5227 | 0.2928 | 0.3753 | 0.1145 | 0.2902 | 0.1643 | 0.4912 | 0.2931 | 0.3671 |
| 2022-07-18 | Double-U | 0.2500 | 0.4895 | 0.3051 | 0.3759 | 0.0847 | 0.2472 | 0.1262 | 0.4642 | 0.3129 | 0.3738 |
| 2022-07-20 | oCLIP | 0.2404 | 0.4772 | 0.0751 | 0.1298 | 0.4121 | 0.4842 | 0.4452 | 0.1746 | 0.0198 | 0.0355 |
| 2022-07-18 | DBNetpp | 0.2034 | 0.3942 | 0.2700 | 0.3205 | 0.0562 | 0.2075 | 0.0885 | 0.3715 | 0.2784 | 0.3183 |
| 2022-08-19 | Baseline - TextTranSpotter (Poly) | 0.1855 | 0.3737 | 0.2497 | 0.2994 | 0.0445 | 0.1637 | 0.0700 | 0.3548 | 0.2614 | 0.3010 |
| 2022-07-21 | BIT_OCR | 0.1590 | 0.2631 | 0.2585 | 0.2608 | 0.0398 | 0.2517 | 0.0687 | 0.2399 | 0.2594 | 0.2493 |
| 2022-08-11 | Baseline - POLYGON | 0.1588 | 0.3144 | 0.2093 | 0.2513 | 0.0444 | 0.1778 | 0.0710 | 0.2918 | 0.2136 | 0.2467 |
| 2022-08-11 | Baseline - BEZIER | 0.1383 | 0.2815 | 0.1812 | 0.2205 | 0.0374 | 0.1509 | 0.0600 | 0.2609 | 0.1853 | 0.2167 |
| 2022-07-28 | End-to-end OCR with transformer | 0.1257 | 0.2556 | 0.1527 | 0.1912 | 0.0438 | 0.1711 | 0.0698 | 0.2293 | 0.1502 | 0.1815 |
| 2022-07-20 | TH-DL | 0.0932 | 0.1839 | 0.1323 | 0.1539 | 0.0216 | 0.1087 | 0.0360 | 0.1689 | 0.1355 | 0.1504 |
| 2022-07-21 | End-to-end OCR with transformer | 0.0014 | 0.0026 | 0.0017 | 0.0020 | 0.0006 | 0.0032 | 0.0010 | 0.0020 | 0.0015 | 0.0017 |
| 2022-07-19 | NNRC | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 2022-07-19 | NNRC_OCR | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
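To make the table's columns easier to read, the sketch below recomputes the Hmean values for the top row (CLOVA OCR DEER, 2022-07-20) from its precision and recall figures. Hmean is the harmonic mean of precision and recall. The first, ungrouped Hmean column is treated here as the arithmetic mean of the OOV and IV Hmean values; that reading is an inference from the numbers in the table, which it matches for the rows checked, and is not stated in the page itself. The helper name `hmean` is chosen for illustration.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 0.0 if total == 0 else 2 * precision * recall / total


# Precision and recall copied from the top row (CLOVA OCR DEER, 2022-07-20).
all_h = hmean(0.6716, 0.5213)  # table lists 0.5870
oov_h = hmean(0.1856, 0.4876)  # table lists 0.2689
iv_h = hmean(0.6450, 0.5259)   # table lists 0.5794

# Assumption inferred from the table: the ranking column averages OOV and IV Hmean.
headline = (oov_h + iv_h) / 2  # table lists 0.4242

print(f"All {all_h:.4f}  OOV {oov_h:.4f}  IV {iv_h:.4f}  headline {headline:.4f}")
```

Because the published precision and recall are already rounded to four decimals, the recomputed values can differ from the table in the last digit.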

Ranking Graphic