Overview - ICDAR 2025 Competition on Historical Map Text Detection, Recognition, and Linking

Please check our Zenodo community and GitHub Organization for datasets, code, and extra artifacts.

Description

Text on digitized historical maps contains valuable information providing georeferenced political and cultural context, yet this information remains largely inaccessible because the maps are stored in an unsearchable raster format. This competition aims to address the unique challenges of detecting and recognizing textual information (e.g., place names) and linking words to form location phrases.

While the detection and recognition tasks share similarities with the long line of prior robust reading competitions [1,2], historical map text extraction presents challenges that are uncommon in scene text extraction, such as dense text regions, rotated and curved text, and widely spaced characters. The word linking task is particularly challenging because the characters of a word can be widely spaced, with complicated text-like distractors, or even other words, appearing between them. Furthermore, words within a single location phrase may be divided across multiple lines to optimize label placement. The figure below illustrates the primary challenges.

This 2025 edition continues the successful 2024 edition and introduces several new features:

  1. expanded data for the French Land Registers dataset
  2. addition of a new Taiwanese maps dataset from the GIS Center at Academia Sinica, featuring Chinese characters
  3. release of synthetic training data for each dataset
  4. improved evaluation metrics and indicators for better feedback

We also created a Google Group that you can join freely to receive competition updates directly in your inbox: https://groups.google.com/g/icdar25-maptext-news

Looking forward to receiving your contributions,

— MapText'25 team


Figure 1: Example images from the competition dataset. Left: Images can contain dense text regions and curved text labels. Middle: Text labels can have wide spacing with distractors between characters. Right: Phrases can span multiple lines.

 

Task Overview

The competition encompasses four tasks: 1) word detection, 2) phrase detection, 3) word detection and recognition, and 4) phrase detection and recognition. Detailed descriptions of each task can be found on the Tasks page.

 

 

  Task Name                             Word Detection   Word Transcription   Word Linking
  1: Word Detection                     ✔️               -                    -
  2: Phrase Detection (Word Grouping)   ✔️               -                    ✔️
  3: Word Detection and Recognition     ✔️               ✔️                   -
  4: Phrase Detection and Recognition   ✔️               ✔️                   ✔️
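
To make the relationship between the four tasks concrete, the following is a minimal Python sketch of how words, transcriptions, and phrase links might be represented. All names here (`Word`, `Phrase`, `TASK_COMPONENTS`, `phrase_text`, `is_complete`) and the example place name are hypothetical illustrations; this is not the official submission format, which is described on the Tasks page.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical data model for illustration only; not the official format.
@dataclass
class Word:
    polygon: List[Tuple[float, float]]  # outline of the detected word
    text: Optional[str] = None          # transcription (Tasks 3 and 4)

@dataclass
class Phrase:
    words: List[Word] = field(default_factory=list)  # linked words, in order

# Components each task requires, mirroring the table above.
TASK_COMPONENTS = {
    1: {"detection"},
    2: {"detection", "linking"},
    3: {"detection", "transcription"},
    4: {"detection", "transcription", "linking"},
}

def phrase_text(phrase: Phrase) -> str:
    """Join the transcriptions of linked words into one location phrase."""
    return " ".join(w.text for w in phrase.words if w.text is not None)

def is_complete(phrase: Phrase, task: int) -> bool:
    """Check that a phrase carries the information the given task needs."""
    if not phrase.words:
        return False
    if "transcription" in TASK_COMPONENTS[task]:
        return all(w.text is not None for w in phrase.words)
    return True

# Usage with a made-up two-word label split across two lines.
phrase = Phrase([
    Word([(0, 0), (40, 0), (40, 10), (0, 10)], "Lake"),
    Word([(5, 14), (55, 14), (55, 24), (5, 24)], "Tahoe"),
])
print(phrase_text(phrase))  # Lake Tahoe
```

Under this sketch, a Task 1 result needs only the polygons, whereas a Task 4 result needs the polygons, a transcription for every word, and the grouping of words into phrases.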

 

References

[1] Chng, Chee Kheng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang et al. "ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text - RRC-ArT." In 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1571-1576. IEEE, 2019.

[2] Yu, Wenwen, Mingyu Liu, Mingrui Chen, Ning Lu, Yinlong Wen, Yuliang Liu, Dimosthenis Karatzas, and Xiang Bai. "ICDAR 2023 Competition on Reading the Seal Title." arXiv preprint arXiv:2304.11966 (2023).

Challenge News

Important Dates

2024-12-10: Website live, train/val sets available

2025-03-01: Test set available, submissions are open

2025-04-01: Final submission deadline, including short reports