method: MapText Strong Pipeline (2025-03-31)

Authors: Yu Xie, Canhui Xu, Jielei Zhang, Pengyu Chen, Weihang Wang, Yuchen He, Peiyi Li, Yihan Meng, Longwen Gao

Affiliation: Bilibili Inc., QUST

Description: For the English MapText recognition task, we employed DNTextSpotter, a novel denoising training method based on DeepSolo. For the Chinese MapText recognition task, we utilized DeepSolo. Data augmentation techniques, including cropping, scaling, and adjustments to saturation and contrast, were applied. Pre-training was conducted using available real-world datasets such as TextOCR, TotalText, IC15, and MLT2017. Post-processing methods were also adopted.
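The augmentations listed above (cropping, scaling, saturation and contrast adjustment) can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the image representation (nested lists of RGB tuples), the function names, and the parameter ranges are all assumptions chosen for illustration.

```python
import random

def crop(img, top, left, height, width):
    """Return the height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def scale_nearest(img, factor):
    """Resize by `factor` using nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    new_h, new_w = max(1, round(h * factor)), max(1, round(w * factor))
    return [[img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
             for x in range(new_w)] for y in range(new_h)]

def _clip(value):
    """Round and clamp a channel value to the 0..255 range."""
    return max(0, min(255, int(round(value))))

def _luma(pixel):
    """Approximate perceived brightness of an RGB pixel."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def adjust_saturation(img, factor):
    """Blend each pixel with its gray value; factor 0 gives grayscale."""
    return [[tuple(_clip(_luma(px) + factor * (c - _luma(px))) for c in px)
             for px in row] for row in img]

def adjust_contrast(img, factor):
    """Push channel values away from (or toward) the mean brightness."""
    pixels = [px for row in img for px in row]
    mean = sum(_luma(px) for px in pixels) / len(pixels)
    return [[tuple(_clip(mean + factor * (c - mean)) for c in px)
             for px in row] for row in img]

def augment(img, rng):
    """Apply one random crop, scale, saturation and contrast change.

    All ranges below are hypothetical, not taken from the submission.
    """
    h, w = len(img), len(img[0])
    crop_h = rng.randint(max(1, h // 2), h)
    crop_w = rng.randint(max(1, w // 2), w)
    top = rng.randint(0, h - crop_h)
    left = rng.randint(0, w - crop_w)
    out = crop(img, top, left, crop_h, crop_w)
    out = scale_nearest(out, rng.uniform(0.5, 2.0))
    out = adjust_saturation(out, rng.uniform(0.5, 1.5))
    out = adjust_contrast(out, rng.uniform(0.5, 1.5))
    return out

image = [[(100, 150, 200)] * 4 for _ in range(4)]
augmented = augment(image, random.Random(0))
```

In a real text-spotting setup the word annotations (polygons or Bézier control points) would need the same crop and scale applied, which this sketch leaves out.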

@article{xie2024dntextspotter, title={DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training}, author={Xie, Yu and Qiao, Qian and Gao, Jun and Wu, Tianxiang and Fan, Jiaqing and Zhang, Yue and Zhang, Jielei and Sun, Huyang}, journal={arXiv preprint arXiv:2408.00355}, year={2024} }

@inproceedings{ye2023deepsolo, title={Deepsolo: Let transformer decoder with explicit points solo for text spotting}, author={Ye, Maoyuan and Zhang, Jing and Zhao, Shanshan and Liu, Juhua and Liu, Tongliang and Du, Bo and Tao, Dacheng}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, pages={19348--19357}, year={2023} }

Source code

method: Self-Sequencer (2025-03-28)

Authors: Mengjie Zou, Tianhao Dai, Remi Petitpierre, Beatrice Vaienti, Frederic Kaplan, Isabella di Lenardo

Affiliation: EPFL, Swiss Federal Institute of Technology in Lausanne

Email: remi.petitpierre@epfl.ch

Description: For word detection and recognition, our approach relies on DeepSolo, whose architecture is derived from Detection Transformers (DETR). In short, DeepSolo extracts hierarchical visual features from map images and processes them through an encoder-decoder architecture to detect words as segments bounded by Bézier curves. The model specifically returns four control points of central Bézier curves per word and then uniformly samples query points along these curves to segment, classify, and delineate each text instance precisely. To resolve duplicate word detections, we implement a postprocessing step inspired by Non-Maximum Suppression. It involves calculating the Fréchet distance between the Bézier curves of potential duplicate word pairs, or "directional synonyms", and merging those below a defined threshold. More details on the model, algorithms, and specific implementation are provided in our separate article [1].
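The duplicate-resolution step described above can be sketched as follows. This is an illustrative approximation, not the authors' implementation (which is detailed in their article [1]): it samples each cubic Bézier curve, uses the discrete Fréchet distance as a stand-in for the Fréchet distance, and "merges" a near-duplicate pair by keeping only the higher-scoring detection. The function names, the `(control_points, score)` data layout, the sample count, and the threshold are assumptions.

```python
import math

def sample_cubic_bezier(ctrl, n=20):
    """Sample n points uniformly in t along a cubic Bezier curve."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = ctrl
    points = []
    for i in range(n):
        t = i / (n - 1)
        u = 1.0 - t
        b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
        points.append((b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3,
                       b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3))
    return points

def discrete_frechet(p, q):
    """Discrete Frechet distance between two polylines (dynamic programming)."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[-1][-1]

def suppress_duplicates(detections, threshold):
    """Greedy NMS-style filtering on curve distance.

    detections: list of (control_points, score) tuples (hypothetical layout).
    A detection is dropped when its curve lies closer than `threshold`
    to an already kept, higher-scoring curve.
    """
    kept = []
    for ctrl, score in sorted(detections, key=lambda d: d[1], reverse=True):
        curve = sample_cubic_bezier(ctrl)
        if all(discrete_frechet(curve, other) >= threshold for _, _, other in kept):
            kept.append((ctrl, score, curve))
    return [(ctrl, score) for ctrl, score, _ in kept]

base = [(0, 0), (1, 1), (2, 1), (3, 0)]
near = [(x, y + 0.05) for x, y in base]   # almost the same word curve
far = [(x, y + 10.0) for x, y in base]    # a different word
result = suppress_duplicates([(base, 0.9), (near, 0.8), (far, 0.7)], threshold=0.5)
```

Because the Fréchet distance depends on the direction in which each curve is traversed, two detections of the same word read in opposite directions would not be matched by this sketch unless one curve is reversed first.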

Model training leverages several real and synthetic datasets: ICDAR MapText [2], MapKuratorHuman [3], SynthMap [3], and the Paris and Jerusalem Maps Text Dataset [4].

References:

[1] Zou, M., Dai, T., Petitpierre, R., Vaienti, B., Kaplan, F., & di Lenardo, I. (2025). Recognizing and Sequencing Multi-word Texts in Maps Using an Attentive Pointer.

[2] Lin, Y., Li, Z., Chiang, Y.Y., & Weinman, J. (2024). Rumsey Train and Validation Data for ICDAR'24 MapText Competition (Version 1.3). Zenodo. https://doi.org/10.5281/zenodo.11516933

[3] Kim, J., Li, Z., Lin, Y., Namgung, M., Jang, L., & Chiang, Y.Y. (2023). The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps. In: Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems. https://arxiv.org/abs/2306.17059

[4] Dai, T., Johnson, K., Petitpierre, R., Vaienti, B., & di Lenardo, I. (2025). Paris and Jerusalem Maps Text Dataset (Version 1.0.0). Zenodo. https://doi.org/10.5281/zenodo.14982662

method: Baseline TESTR Finetuned (2025-04-29)

Authors: Organizers

Affiliation: ICDAR’25 RRC-MapText

Description: A TESTR checkpoint (polygon prediction head, tuned on TotalText) is further fine-tuned on the competition training data available for each dataset.

Ranking Table

Overall / Words

| Date | Method | H-Mean | Precision | Recall | Tightness | Char Accuracy | Char Quality | Det Quality | F-Score |
|---|---|---|---|---|---|---|---|---|---|
| 2025-03-31 | MapText Strong Pipeline | 91.13% | 95.88% | 91.84% | 83.75% | 94.04% | 73.89% | 78.57% | 93.82% |
| 2025-03-28 | Self-Sequencer | 90.30% | 91.52% | 89.13% | 86.14% | 94.86% | 73.79% | 77.79% | 90.31% |
| 2025-04-29 | Baseline TESTR Finetuned | 89.53% | 89.14% | 90.04% | 86.28% | 92.92% | 71.82% | 77.30% | 89.59% |
| 2025-01-10 | [Baseline MapText '24] MapText Detection and Recognition Strong Pipeline | 89.26% | 96.16% | 85.01% | 83.27% | 93.97% | 70.61% | 75.14% | 90.24% |
| 2025-01-10 | [Baseline MapText '24] MapTest | 87.37% | 90.47% | 88.23% | 81.82% | 89.51% | 65.42% | 73.09% | 89.34% |
| 2025-04-19 | CREPE + BezierCurve | 84.93% | 87.10% | 86.53% | 73.62% | 95.47% | 61.02% | 63.91% | 86.81% |
| 2025-01-10 | [Baseline MapText'24] MapTextSpotter | 84.52% | 92.61% | 81.51% | 81.44% | 83.46% | 58.94% | 70.62% | 86.71% |
| 2025-01-10 | [Baseline MapText'24] DS-LP | 77.55% | 71.76% | 78.93% | 71.63% | 90.83% | 48.90% | 53.84% | 75.17% |
| 2025-01-10 | [Baseline MapText'24] TESTR Checkpoint | 74.61% | 71.85% | 66.90% | 79.55% | 82.10% | 45.26% | 55.12% | 69.29% |
| 2025-04-20 | Word-Level Text Detection and Recognition on Historical Maps Using Preprocessing and PaddleOCR | 59.46% | 59.54% | 40.01% | 73.89% | 83.71% | 29.60% | 35.36% | 47.86% |
| 2025-04-19 | YOLOv8_ViTAE_PolygonDetector | 27.42% | 61.75% | 9.64% | 75.97% | 77.96% | 9.88% | 12.67% | 16.68% |
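
The metric columns in the ranking table appear to be related as follows. This is an inference from the published numbers, not a definition quoted from this page: F-Score is the harmonic mean of Precision and Recall, Det Quality is F-Score times Tightness, Char Quality is Det Quality times Char Accuracy, and H-Mean is the harmonic mean of Precision, Recall, Tightness, and Char Accuracy. The short sketch below (function and key names are hypothetical) recomputes the derived columns from the four base values so that any row can be checked.

```python
def harmonic_mean(values):
    """Harmonic mean of positive fractions; 0.0 if any value is not positive."""
    if any(v <= 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)

def derived_metrics(precision, recall, tightness, char_accuracy):
    """Recompute derived columns from base metrics given as fractions in [0, 1].

    The relations are assumed from the table values, not from an official spec.
    """
    f_score = harmonic_mean([precision, recall])
    det_quality = f_score * tightness
    char_quality = det_quality * char_accuracy
    h_mean = harmonic_mean([precision, recall, tightness, char_accuracy])
    return {
        "f_score": f_score,
        "det_quality": det_quality,
        "char_quality": char_quality,
        "h_mean": h_mean,
    }

# Base values of the top-ranked row (MapText Strong Pipeline).
row = derived_metrics(0.9588, 0.9184, 0.8375, 0.9404)
for name, value in row.items():
    print(f"{name}: {value * 100:.2f}%")
```

For the top-ranked row, the recomputed values match the tabulated F-Score (93.82%), Det Quality (78.57%), Char Quality (73.89%), and H-Mean (91.13%) to two decimal places.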

Ranking Graphic