Authors: Kai Yang, Ye Wang, Bin Wang, Wentao Liu, Xiaolu Ding, Jun Zhu, Ming Chen, Peng Yao, Zhixin Qiu
Affiliation: CCB Financial Technology Co. Ltd, China
Description:
1. Data Analysis
The competition organizers provided 5,000 training images. Analyzing the data, we found that the seals fall into four shape categories: round, oval, square, and triangular, with round and oval seals being the most common. The training set covers a variety of challenging conditions, including rotation in multiple directions, uneven coloring, overlapping seals, and indistinct seal patterns.
2. Data Processing
For data processing, we began by re-annotating the training set images and enlarging them to squares. We then rotated the images, producing a total of 15,000 images. We also generated synthetic data for difficult cases, such as overlapping or blurry stamps. Before generating the seal data, we collected a large number of company and organization names from the internet. For each name, we computed the rotation angle and position of every individual character based on the length of the name, and merged the characters into the seal's background image. We also output the coordinates of the outer edge points of the text. To make the generated seals more realistic, we varied the colors, fonts, backgrounds, and textures. The base image for each seal was created by randomly cropping backgrounds, and we used the RGBA format during generation so that the alpha (transparency) channel could control the color depth of the seal. We also included two types of seal borders: solid and fragmented.
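The character placement step described above can be illustrated with a minimal sketch. It is not our actual generator: the function name `layout_chars_on_arc`, the parameters `degrees_per_char` and `max_span`, and the choice of a circular seal with the text centered at the top are all assumptions made for illustration. The sketch derives each character's position and rotation angle from the length of the name and also returns outer edge points of the text band.

```python
import math

def layout_chars_on_arc(text, center=(200.0, 200.0), radius=150.0,
                        char_height=40.0, degrees_per_char=20.0, max_span=270.0):
    """Place the characters of `text` on a circular arc (hypothetical helper).

    Returns (placements, outer_edge):
      placements: one dict per character with its center (x, y) and its
                  clockwise rotation in degrees (0 = upright at the top).
      outer_edge: points on the outer boundary of the text band, one per
                  character cell boundary (len(text) + 1 points).
    Image coordinates are assumed: x grows rightward, y grows downward.
    """
    n = len(text)
    if n == 0:
        return [], []
    # Assumption: the arc grows with the name length, up to a maximum span.
    span = min(max_span, degrees_per_char * n)
    step = span / n
    cx, cy = center

    placements = []
    for i, ch in enumerate(text):
        # Angle of the character center, clockwise from the 12 o'clock position.
        theta = -span / 2 + (i + 0.5) * step
        rad = math.radians(theta)
        placements.append({
            "char": ch,
            "x": cx + radius * math.sin(rad),
            "y": cy - radius * math.cos(rad),
            "rotation": theta,
        })

    # Outer edge: same angular range, pushed out by half the character height.
    outer_radius = radius + char_height / 2
    outer_edge = []
    for i in range(n + 1):
        rad = math.radians(-span / 2 + i * step)
        outer_edge.append((cx + outer_radius * math.sin(rad),
                           cy - outer_radius * math.cos(rad)))
    return placements, outer_edge


placements, outer_edge = layout_chars_on_arc("ABC")
for p in placements:
    print(p["char"], round(p["x"], 1), round(p["y"], 1), p["rotation"])
```

In a full generator, each glyph would be rendered, rotated by its angle, and pasted at its position on the RGBA seal layer; oval seals would need an elliptical version of the same calculation.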
3. Model Introduction
For this segmentation task, we used a “voting ensemble” method to detect the seal title region. The ensemble consists of five models: Mask R-CNN, K-Net, Segformer, Segmenter, and UperNet. Each model generates a mask, and we take a majority vote over these masks to derive the final mask, from which we identify the seal title area.
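The voting step can be sketched as follows. This is a minimal illustration under stated assumptions rather than our exact implementation: it assumes every model's output has already been converted to a binary mask of the same shape, that the vote is taken per pixel, and that a pixel is kept when a strict majority of models (3 of 5) mark it. The function name `majority_vote` is hypothetical.

```python
import numpy as np

def majority_vote(masks):
    """Combine same-shaped binary masks by per-pixel strict majority vote."""
    # Stack to shape (num_models, H, W); np.stack raises ValueError on shape mismatch.
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks], axis=0)
    votes = stack.sum(axis=0)  # number of models marking each pixel
    # Keep a pixel only if more than half of the models voted for it.
    return votes * 2 > stack.shape[0]


# Toy example with five 2x2 masks standing in for the five model outputs.
masks = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 1]],
    [[1, 1], [0, 0]],
    [[0, 1], [0, 1]],
    [[0, 0], [0, 0]],
]
final_mask = majority_vote(masks)
print(final_mask.astype(int))
```

With an odd number of models a strict majority never ties, which is one practical reason to ensemble five models rather than four.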