Results - Out of Vocabulary Scene Text Understanding

method: Semi Supervised Learning for OOV Text Recognition - NHN Cloud (2023-06-27)

Authors: Yeongyu Kim, Jeasung Park

Affiliation: NHN Cloud

Description: Semi-supervised learning can improve classification performance by exploiting unlabeled raw images. We investigated training on unlabeled images with a consistency or contrastive loss while keeping the original cross-entropy loss for labeled data. The training set provided by the OOV organizers and synthetic data (MJ, ST) served as labeled data; word images cropped by a text detector from open benchmark datasets (TextVQA, ST-VQA, ...) served as unlabeled data.
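
The combined objective described above can be sketched for a single character position as follows. This is only an illustration under assumptions: the description does not give the exact consistency term or its weight, so the mean squared difference between two augmented views and the `weight` parameter are hypothetical choices, and the function names are made up for this example.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one list of class logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Supervised loss for a labeled sample (target is a class index).
    return -math.log(softmax(logits)[target])

def consistency(logits_a, logits_b):
    # Assumed consistency term: mean squared difference between the
    # predictions for two augmented views of the same unlabeled image.
    pa, pb = softmax(logits_a), softmax(logits_b)
    return sum((x - y) ** 2 for x, y in zip(pa, pb)) / len(pa)

def semi_supervised_loss(labeled, unlabeled, weight=1.0):
    # labeled: list of (logits, target); unlabeled: list of (logits_a, logits_b).
    sup = sum(cross_entropy(l, t) for l, t in labeled) / len(labeled)
    unsup = sum(consistency(a, b) for a, b in unlabeled) / len(unlabeled)
    return sup + weight * unsup
```

When both views of every unlabeled image yield the same prediction, the unsupervised term vanishes and the objective reduces to plain cross-entropy on the labeled data.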

method: Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss (2023-03-04)

Authors: Xuhua Ren, Lu Wang

Email: renxuhua1993@gmail.com

Description: Scene Text Recognition is an important component in various vision and language tasks. Recognizing out-of-vocabulary (OOV) words remains a challenge, and some studies suggest distinguishing between in-vocabulary (IV) and OOV words. To address this issue, we present two novel contributions. First, we propose a novel pseudo-label generation module that combines character detection and image inpainting modules to produce substantial training data. Second, we introduce an approach that optimizes the geodesic distance margins to reduce the impact of noisy samples in pseudo-labels on model convergence during training.
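
The description does not spell out the margin formulation. As a rough illustration only, the sketch below assumes an additive angular margin of the kind used in ArcFace-style losses, where the margin is added to the angle (the geodesic distance on the unit hypersphere) between a feature and its target class centre. The function name, `margin`, and `scale` are hypothetical and are not taken from the submission.

```python
import math

def angular_margin_logits(cosines, target, margin=0.5, scale=30.0):
    # cosines[i] is the cosine similarity between the normalized feature
    # and the normalized weight vector of class i.
    out = []
    for i, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if i == target:
            theta = math.acos(c)
            # Penalize the target class by widening its angle; clamp at pi
            # so the cosine stays monotonic.
            c = math.cos(min(theta + margin, math.pi))
        out.append(scale * c)
    return out
```

The resulting logits would then feed a standard softmax cross-entropy loss. Because the target logit is lowered, the model has to place features closer to their class centre to reach the same loss.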

method: Optimized Transformer for OOV Text Recognition - NHN Cloud (2023-06-16)

Authors: Yeongyu Kim

Affiliation: NHN Cloud

Email: yeg.kim@nhn.com

Description: In the OOV (Out of Vocabulary) task, words whose labels never appear in the training data must still be recognized. We use adaptive positional encoding and our own macaron-style transformer encoder. A permutation algorithm is applied to the decoder to make the most of the label combinations in the training data. Synthetic data (MJ, ST) are used along with the provided OOV training data.
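
The description does not say how the permutation is applied in the decoder. One plausible reading, assumed here purely for illustration, is permutation-based decoding in which each training step samples several orders over the character positions and builds an attention mask per order, so that a position may only attend to positions that come earlier in that order. The helper names below are hypothetical.

```python
import math
import random

def permutation_mask(order):
    # order[r] is the character position decoded at step r.
    # mask[q][k] is True when position q may attend to position k,
    # i.e. when k is decoded before q under this order.
    n = len(order)
    rank = {pos: r for r, pos in enumerate(order)}
    return [[rank[k] < rank[q] for k in range(n)] for q in range(n)]

def sample_orders(length, count, seed=0):
    # Always keep the left-to-right order, then add distinct random ones.
    rng = random.Random(seed)
    count = min(count, math.factorial(length))  # cannot exceed length!
    orders = [list(range(length))]
    while len(orders) < count:
        perm = list(range(length))
        rng.shuffle(perm)
        if perm not in orders:
            orders.append(perm)
    return orders
```

With the left-to-right order the mask is the usual strictly lower-triangular causal mask; other orders expose the decoder to different context combinations of the same label.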

Ranking Table

Date	Method	Total CRW	Total ED	IV CRW	IV ED	OOV CRW
2023-06-27	Semi Supervised Learning for OOV Text Recognition - NHN Cloud	71.92%	92166	83.10%	36494	60.74%
2023-03-04	Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss	70.98%	100594	82.81%	42608	59.15%
2023-06-16	Optimized Transformer for OOV Text Recognition - NHN Cloud	70.82%	98646	81.92%	38529	59.71%
2024-02-26	Self-Supervised Learning for OOV Text Recognition - HuiGuan	70.38%	108657	81.92%	49290	58.84%
2022-07-21	OCRFLY_V2	70.31%	123947	81.02%	46048	59.61%
2023-02-27	HuiGuanV2	70.28%	110990	81.73%	49889	58.83%
2022-07-21	oov3decode	70.22%	94259	81.58%	40175	58.86%
2022-07-21	Vision Transformer Based Method	70.00%	94701	81.36%	40187	58.64%
2022-07-21	dat	69.90%	96513	80.78%	40082	59.03%
2022-07-20	ocrfly	69.83%	131232	80.63%	53243	59.03%
2022-07-21	ggui	69.80%	96597	80.74%	40171	58.86%
2022-07-21	spring	69.74%	96477	80.74%	40115	58.74%
2022-07-21	DataMatters	69.68%	96544	80.71%	40177	58.65%
2022-07-20	Cropped Recognition	69.65%	108766	80.63%	44958	58.68%
2022-07-21	MaskOCR	69.63%	108894	80.60%	44971	58.65%
2022-07-20	SCATTER	69.58%	113482	79.72%	43890	59.45%
2022-07-20	Summer	68.77%	103211	79.48%	42118	58.06%
2022-07-18	let me see see	68.46%	116503	80.81%	51165	56.11%
2022-07-20	Using only real data	68.28%	118185	79.28%	48517	57.27%
2023-04-07	test1	68.21%	123384	79.73%	56472	56.68%
2022-08-11	Baseline - SCATTER_v2	66.68%	128219	77.98%	52535	55.38%
2022-07-18	PTViT	66.29%	120449	77.52%	49410	55.06%
2022-07-20	demo	65.86%	124347	77.25%	48907	54.47%
2022-08-11	Baseline - CLOVA_v2	64.97%	138479	75.98%	54346	53.96%
2022-10-19	attn	64.02%	144275	76.47%	64446	51.57%
2022-07-19	TRBA_CocoValid_InfRotation2.0_SpaceRemove	63.98%	132781	77.76%	60693	50.20%
2022-07-19	HuiGuan	63.73%	162870	74.77%	68926	52.69%
2022-10-18	ctc	63.51%	141100	75.63%	63866	51.39%
2022-07-18	exp5_merge	54.87%	143070	70.93%	57786	38.81%
2022-07-20	EOCR: Ensemble Optical Character Recognition	46.66%	350166	55.30%	113317	38.02%
2022-07-17	BASELINE - Official Clova	44.47%	365566	52.61%	114101	36.34%
2022-07-19	NNRC	38.54%	405603	45.36%	136384	31.73%
2022-07-19	NN	37.17%	426074	43.38%	144032	30.97%
2022-07-18	Cluster Character Loss in Scene Text Recognition	31.06%	552570	47.40%	202087	14.73%
2022-07-20	Transformer for multi-language OCR	0.00%		0.00%		0.00%
2022-07-21	TEST	0.00%		0.00%		0.00%
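
The metrics in the table can be illustrated with a small sketch. It assumes that CRW is the fraction of words recognized exactly and that ED is the Levenshtein edit distance summed over all words; the case handling and any normalization used by the official evaluation are not stated here, so exact string comparison is an assumption. In the rows listed, the first CRW column equals the mean of the IV and OOV CRW columns (for example, 83.10% and 60.74% average to 71.92%).

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def score(pairs):
    # pairs: non-empty list of (prediction, ground_truth) strings.
    # Returns (CRW as a fraction, summed edit distance).
    correct = sum(pred == gt for pred, gt in pairs)
    total_ed = sum(edit_distance(pred, gt) for pred, gt in pairs)
    return correct / len(pairs), total_ed

def mean_crw(iv_crw, oov_crw):
    # Average of the IV and OOV scores, as observed in the table rows.
    return (iv_crw + oov_crw) / 2
```

For instance, `score([("hello", "hello"), ("wrld", "world")])` counts one exact match out of two words and one missing character in total.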
