Results - Out of Vocabulary Scene Text Understanding

method: CLOVA OCR DEER (2022-07-20)

Authors: Taeho Kil, Seonghyeon Kim, Sukmin Seo

Affiliation: Clova AI OCR Team, NAVER/LINE Corp.

Description: An end-to-end scene text spotter built from a CNN backbone, a deformable transformer encoder, a location decoder, and a text decoder. The location decoder, based on a segmentation method (Differentiable Binarization), detects text regions; the text decoder, based on a deformable transformer decoder, recognizes each instance from the image features and the detected location information. We use a single model rather than an ensemble, and all sub-modules are end-to-end trainable. Training uses the real datasets provided by this challenge (train + val splits) and a synthetic dataset. Because the COCO-Text annotations contain a lot of label noise in letter capitalization, we refined them using a teacher model trained without COCO-Text.
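The capitalization refinement mentioned in the description can be illustrated with a minimal sketch. The source does not say how teacher predictions were merged with the existing annotations, so the rule below is an assumption: the teacher's casing is adopted only when its prediction spells the same word as the original label, ignoring case. The function name `refine_case` is hypothetical.

```python
def refine_case(original: str, teacher_pred: str) -> str:
    """Adopt the teacher's capitalization only when both strings match case-insensitively.

    Assumed rule for illustration; not the authors' documented procedure.
    """
    if original.casefold() == teacher_pred.casefold():
        # Same characters, possibly different casing: trust the teacher's casing.
        return teacher_pred
    # The teacher disagrees on the characters themselves: keep the annotation.
    return original


print(refine_case("STOP", "Stop"))
print(refine_case("stop", "shop"))
```

Restricting the change to casing keeps the refinement from introducing new transcription errors when the teacher misreads a word.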

@article{kim2022deer, title={DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting}, author={Kim, Seonghyeon and Shin, Seung and Kim, Yoonsik and Cho, Han-Cheol and Kil, Taeho and Surh, Jaeheung and Park, Seunghyun and Lee, Bado and Baek, Youngmin}, journal={arXiv preprint arXiv:2203.05122}, year={2022} }

method: e2e text spotter - final version (2022-07-21)

Authors: Taeho Kil, Seonghyeon Kim, Sukmin Seo

Affiliation: Clova AI OCR Team, NAVER/LINE Corp.

method: CLOVA OCR DEER (2022-07-19)

Authors: Sukmin Seo, Taeho Kil, Seonghyeon Kim

Affiliation: Clova AI OCR Team, NAVER/LINE Corp.

Description: An end-to-end scene text spotter based on a CNN backbone, a deformable transformer encoder, a location decoder, and a text decoder. COCO-Text label capitalization (upper case vs. lower case) was corrected by pseudo-labeling.
batch : 28
iter : 400k
aug : RandomRotate, RandomResizeScale, RandomCrop, ColorJitter
lr : 3e-4
weight_decay : 1e-6
without ensemble
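For reference, the hyperparameters listed above can be gathered into a single configuration object. This is only a sketch: the class and field names are hypothetical, no particular training framework is assumed, and the optimizer and learning-rate schedule are not given in the source, so they are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from the submission notes; field names are illustrative."""
    batch_size: int = 28
    iterations: int = 400_000
    learning_rate: float = 3e-4
    weight_decay: float = 1e-6
    augmentations: tuple = (
        "RandomRotate",
        "RandomResizeScale",
        "RandomCrop",
        "ColorJitter",
    )
    ensemble: bool = False  # "without ensemble": a single model is used

    @property
    def samples_seen(self) -> int:
        # Total training samples drawn, assuming `batch_size` is the global batch size.
        return self.batch_size * self.iterations


cfg = TrainConfig()
print(cfg)
print(cfg.samples_seen)
```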

Ranking Table


Date	Method	Hmean	All Precision	All Recall	All Hmean	OOV Precision	OOV Recall	OOV Hmean	IV Precision	IV Recall	IV Hmean
2022-07-20	CLOVA OCR DEER	0.4242	0.6716	0.5213	0.5870	0.1856	0.4876	0.2689	0.6450	0.5259	0.5794
2022-07-21	e2e text spotter - final version	0.4239	0.6717	0.5204	0.5864	0.1858	0.4872	0.2690	0.6451	0.5249	0.5788
2022-07-19	CLOVA OCR DEER	0.4057	0.6399	0.5243	0.5764	0.1624	0.4800	0.2427	0.6129	0.5303	0.5686
2022-07-20	DB_threshold2_TRBA_CocoValid	0.3910	0.6408	0.4993	0.5613	0.1526	0.4229	0.2243	0.6160	0.5096	0.5578
2022-07-21	E2E_MASK	0.3213	0.4790	0.5414	0.5083	0.0864	0.4673	0.1458	0.4520	0.5514	0.4968
2022-07-21	yyds	0.2868	0.5153	0.3554	0.4207	0.1063	0.3336	0.1612	0.4857	0.3583	0.4124
2022-07-21	yyvis	0.2848	0.5120	0.3531	0.4180	0.1054	0.3326	0.1600	0.4823	0.3559	0.4095
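The table does not define its columns, so the following sketch rests on two assumptions: that each Hmean is the harmonic mean of the precision and recall in the same group, and that the first Hmean column (used for ranking) is the arithmetic mean of the OOV and IV Hmean values. The numbers in the top row are consistent with both readings to within rounding. The function name `hmean` is illustrative.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Top row of the table (CLOVA OCR DEER, 2022-07-20):
# subset -> (precision, recall, reported Hmean)
row = {
    "All": (0.6716, 0.5213, 0.5870),
    "OOV": (0.1856, 0.4876, 0.2689),
    "IV": (0.6450, 0.5259, 0.5794),
}
reported_ranking_hmean = 0.4242

for subset, (p, r, reported) in row.items():
    print(f"{subset}: computed {hmean(p, r):.4f}, reported {reported:.4f}")

# Assumed ranking metric: mean of the OOV and IV Hmean values.
ranking_hmean = (row["OOV"][2] + row["IV"][2]) / 2
print(f"ranking: computed {ranking_hmean:.5f}, reported {reported_ranking_hmean:.4f}")
```

Under this reading, the ranking metric weights out-of-vocabulary and in-vocabulary words equally, which is why it sits well below the Hmean over all words.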
