Overview - ICDAR 2024 Competition on Historical Map Text Detection, Recognition, and Linking

Please check our Zenodo community and GitHub Organization for datasets, code, and extra artifacts.

Description

Text on digitized historical maps carries valuable georeferenced political and cultural context, yet this wealth of information remains largely inaccessible because the maps exist only as unsearchable raster images. This competition aims to address the unique challenges of detecting and recognizing textual information (e.g., place names) and of linking words to form location phrases.

While the detection and recognition tasks share similarities with the long line of prior robust reading competitions [1, 2], historical map text extraction presents challenges that are uncommon in scene text extraction, such as dense text regions, rotated and curved text, and widely spaced characters. The word linking task is particularly challenging: words can be widely spaced, with complicated text-like distractors, or even other words, appearing between their characters. Furthermore, the words of a single location phrase may be split across multiple lines to optimize label placement. The figure below illustrates these primary challenges.


Figure 1: Example images from the competition dataset. Left: Images can contain dense text regions and curved text labels. Middle: Text labels can have wide spacing with distractors between characters. Right: Phrases can span multiple lines.

 

Task Overview

The competition comprises four tasks: 1) word detection, 2) phrase detection, 3) word detection and recognition, and 4) phrase detection and recognition. The table below summarizes the sub-problems each task involves (an illustrative sketch follows the table); detailed descriptions of each task can be found on the Tasks page.

 

 

  Task Name                              Word Detection   Word Transcription   Word Linking
  1: Word Detection                      ✓
  2: Phrase Detection (Word Grouping)    ✓                                     ✓
  3: Word Detection and Recognition      ✓                ✓
  4: Phrase Detection and Recognition    ✓                ✓                    ✓
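To make the relationship between the three sub-problems concrete, the sketch below shows one way a single map label split across two words might be represented: each word carries a detection polygon, an optional transcription, and a group identifier that encodes the linking. This is a minimal illustration only; the field names (vertices, text, group_id) and the sample label are assumptions for exposition, not the official annotation or submission schema, which is documented on the Tasks page.

```python
# Illustrative only: a hypothetical per-word record, NOT the official schema.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class WordAnnotation:
    vertices: List[Tuple[int, int]]  # polygon outlining the word (Task 1: detection)
    text: Optional[str] = None       # transcription (Tasks 3 and 4: recognition)
    group_id: Optional[int] = None   # words sharing an id form one phrase (Tasks 2 and 4: linking)

# A single location phrase ("Rocky Mountains") split across two lines on the map.
label = [
    WordAnnotation(vertices=[(120, 40), (210, 38), (212, 70), (122, 72)],
                   text="Rocky", group_id=7),
    WordAnnotation(vertices=[(118, 80), (245, 78), (247, 112), (120, 114)],
                   text="Mountains", group_id=7),
]

# Task 1 evaluates only the polygons; Task 2 adds the grouping;
# Tasks 3 and 4 additionally require the transcriptions.
phrase = " ".join(w.text for w in sorted(label, key=lambda w: w.vertices[0][1]))
print(phrase)  # "Rocky Mountains"
```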

 

References

[1] Chng, Chee Kheng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang, et al. "ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)." In 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1571-1576. IEEE, 2019.

[2] Yu, Wenwen, Mingyu Liu, Mingrui Chen, Ning Lu, Yinlong Wen, Yuliang Liu, Dimosthenis Karatzas, and Xiang Bai. "ICDAR 2023 Competition on Reading the Seal Title." arXiv preprint arXiv:2304.11966 (2023).

Important Dates

2 January 2024: Competition announced

1 February 2024: Training and validation data released

1 March 2024: Competition test data released

6 May 2024 [Extended]: Final results submission deadline (AoE time zone)