Results - ICDAR 2023 Competition on Reading the Seal Title

method: Dao Xianghu light of TianQuan (2023-03-19)

Authors: Kai Yang, Ye Wang, Bin Wang, Wentao Liu, Xiaolu Ding, Jun Zhu, Ming Chen, Peng Yao, Zhixin Qiu

Affiliation: CCB Financial Technology Co. Ltd, China

Description: 1. Data Analysis
The organizers provided 5,000 training images. Our analysis showed that the seals fall into four shape categories: round, oval, square, and triangular, with round and oval seals being the most common. The training set covers a range of conditions, including seals rotated in multiple directions, uneven colors, overlapping seals, and indistinct seal patterns.
2. Data Processing
For data processing, we first re-annotated the training set images and enlarged them to squares. We then rotated the data, producing a total of 15,000 images. We also generated synthetic data for difficult samples, such as overlapping or blurry seals. Before generating the seal data, we gathered a large number of company and organization names from the internet. For each name, we computed the rotation angle and position of each individual character based on the length of the name and composited the characters onto the seal's background image. We also output the coordinates of the outer edge points of the text. To make the generated seals more realistic, we varied colors, fonts, backgrounds, and textures. The base image for each seal was created by randomly cropping backgrounds, and we generated the data in RGBA format so that the transparency channel could control the color depth of the seal. We also included two types of seal borders: solid and fragmented.
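The per-character placement step can be illustrated with a minimal sketch. It is not the team's generator: the function name `layout_title_on_arc`, the 240-degree default arc, and the even spacing are all assumptions chosen for illustration. It computes, for a name of a given length, where each character would sit on a circular seal and how far it would be rotated.

```python
import math


def layout_title_on_arc(text, center, radius, arc_degrees=240.0):
    """Return one (char, x, y, rotation_degrees) tuple per character.

    Hypothetical helper: characters are spread evenly over an arc centred
    on the top of the seal. Image coordinates are assumed (x grows to the
    right, y grows downwards).
    """
    cx, cy = center
    n = len(text)
    if n == 0:
        return []
    # Angular gap between neighbouring characters; a lone character sits at the top.
    step = arc_degrees / (n - 1) if n > 1 else 0.0
    start = -arc_degrees / 2.0 if n > 1 else 0.0
    placements = []
    for i, ch in enumerate(text):
        # Angle measured clockwise from the 12 o'clock direction.
        theta = start + i * step
        rad = math.radians(theta)
        x = cx + radius * math.sin(rad)
        y = cy - radius * math.cos(rad)
        # Rotating the glyph clockwise by theta keeps its baseline tangent to the circle.
        placements.append((ch, x, y, theta))
    return placements


placements = layout_title_on_arc("ABC", center=(100, 100), radius=80, arc_degrees=180)
```

For a three-character name on a 180-degree arc, the characters land at the left, top, and right of the circle with rotations of -90, 0, and 90 degrees. Rendering the glyphs and compositing them onto an RGBA background would be handled by an imaging library and is left out here.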
3. Model Introduction
For this segmentation task, we used a “voting ensemble” method to detect the seal title. The ensemble comprises five models: Mask R-CNN, K-Net, Segformer, Segmenter, and UperNet. Each model generates a mask, and a majority vote over the five masks yields the final mask, from which we identify the seal title area.
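The majority vote over masks can be sketched as follows. This is an illustration under assumptions, not the team's code: it treats every model output as a binary mask of the same size (given here as nested lists of 0 and 1), and the function name `majority_vote` is hypothetical. How the instance masks from Mask R-CNN were converted to a single mask is not described in the entry.

```python
def majority_vote(masks):
    """Combine equal-sized binary masks by a per-pixel majority vote."""
    if not masks:
        raise ValueError("at least one mask is required")
    height, width = len(masks[0]), len(masks[0][0])
    # A pixel is foreground when strictly more than half of the models mark it.
    threshold = len(masks) / 2
    return [
        [1 if sum(m[r][c] for m in masks) > threshold else 0 for c in range(width)]
        for r in range(height)
    ]


# Five toy 2x3 masks standing in for the five model outputs.
model_masks = [
    [[1, 1, 0], [0, 0, 0]],
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[0, 0, 1], [0, 1, 0]],
    [[0, 1, 0], [0, 0, 0]],
]
final_mask = majority_vote(model_masks)
```

With five voters a pixel needs at least three votes, so ties cannot occur. In practice the same logic would more likely run on array types from a numerical library.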

Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. In ICCV, 2980–2988, 2017.

Robin Strudel, Ricardo Garcia, Ivan Laptev, Cordelia Schmid. Segmenter: Transformer for semantic segmentation. In ICCV, 7262–7272, 2021.

Source code

Source code 2

method: SPDB LAB (2023-03-16)

Authors: Jie Li, Wei Wang, Yuqi Zhang, Ruixue Zhang, Yiru Zhao, Danya Zhou, Di Wang, Dong Xiang, Hui Wang, Min Xu, Pengyu Chen, Bin Zhang, Chao Li, Shiyu Hu, Songtao Li, Yunxin Yang

Affiliation: Shanghai Pudong Development Bank

Email: zhangyq26@outlook.com, wangdee0805@139.com, lij131@spdb.com.cn

Description: In Task 1, circular, elliptical, rectangular, and triangular seals were trained with different methods. The seal title detection model is PANNet, trained on the provided training data together with synthetic data produced by the team. The synthetic data is based on a style analysis of the training data, and more than 20,000 training samples were synthesized in total. Two PANNet models were trained and applied to the test set: one for circular and elliptical seals, and one for rectangular and triangular seals.

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Source code

method: Aaaaa_v3 (2023-03-20)

Authors: Wudao, Liaoming

Affiliation: cmb

Description: DBNet++ with a MobileOne-S3 backbone. In the training phase, more than 20,000 generated circular and rectangular seal images were used. At inference, each image is processed at two scales (320 and 480), and the results are merged by keeping the detections with the highest confidence.
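The two-scale inference step can be sketched as below. This is a guess at the general pattern rather than the entry's implementation: `detect` is a hypothetical callable standing in for the detector, assumed to return (polygon, confidence) pairs already mapped back to original-image coordinates, and the sketch keeps a single best detection per image.

```python
def best_across_scales(detect, image, sizes=(320, 480)):
    """Run `detect` at each input size and keep the highest-confidence result.

    `detect(image, size)` is a hypothetical callable returning a list of
    (polygon, confidence) pairs. Returns None when nothing is detected.
    """
    best = None
    for size in sizes:
        for polygon, confidence in detect(image, size):
            if best is None or confidence > best[1]:
                best = (polygon, confidence)
    return best


def fake_detect(image, size):
    # Stand-in detector with canned outputs for each scale.
    canned = {
        320: [("polygon_at_320", 0.81)],
        480: [("polygon_at_480", 0.93), ("spurious_at_480", 0.40)],
    }
    return canned[size]


result = best_across_scales(fake_detect, image=None)
```

Here the detection from the 480 scale is kept because its confidence is higher than the one from the 320 scale.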

M. Liao, Z. Zou, Z. Wan, C. Yao and X. Bai, "Real-Time Scene Text Detection With Differentiable Binarization and Adaptive Scale Fusion," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, pp. 919-931, 1 Jan. 2023, doi: 10.1109/TPAMI.2022.3155612.

P. Vasu, J. Gabriel, J. Zhu and O. Tuzel, "An Improved One millisecond Mobile Backbone," 2022, doi: 10.48550/arXiv.2206.04040.

Source code

Source code 2

Ranking Table


Date	Method	Precision-0.7	Recall-0.7	Hmean-0.7	Precision	Recall	Hmean
2023-03-19	Dao Xianghu light of TianQuan	99.06%	99.06%	99.06%	99.92%	99.92%	99.92%
2023-03-16	SPDB LAB	97.60%	97.60%	97.60%	99.92%	99.92%	99.92%
2023-03-20	Aaaaa_v3	97.34%	97.32%	97.33%	99.36%	99.34%	99.35%
2023-03-14	Aaaaa_v1	96.72%	96.64%	96.68%	99.16%	99.08%	99.12%
2023-03-19	PAN++ with Res101	92.22%	92.22%	92.22%	97.48%	97.48%	97.48%
2023-03-07	detect_test	85.96%	85.96%	85.96%	99.00%	99.00%	99.00%
2023-03-08	Keypoint-based curved text detection	82.42%	82.42%	82.42%	92.72%	92.72%	92.72%
2023-03-20	Mask way	1.28%	1.28%	1.28%	4.30%	4.30%	4.30%
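
The Hmean values in the table are consistent with the harmonic mean of precision and recall. A short check, assuming that definition (the evaluation code itself is not shown on this page, and the function name `hmean` is ours):

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall, both given in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision-0.7 and Recall-0.7 of the Aaaaa_v3 and Aaaaa_v1 rows.
aaaaa_v3 = round(hmean(97.34, 97.32), 2)
aaaaa_v1 = round(hmean(96.72, 96.64), 2)
```

Both results match the Hmean-0.7 column (97.33 and 96.68). For rows where precision and recall are equal, the harmonic mean is that same value.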
