Overview - ICDAR 2024 Competition on Handwriting Recognition of Historical Ciphers

Handwritten Text Recognition (HTR) in low-resource scenarios (i.e. when labeled data is scarce) is a challenging problem. This is particularly true of historical encrypted manuscripts, so-called ciphers, which contain secret messages and were typically used in military or diplomatic correspondence, records of secret societies, or private letters. To hide their contents, the sender and receiver created their own secret method of writing. The cipher alphabets often include digits, Latin or Greek letters, Zodiac and alchemical signs combined with various diacritics, as well as invented symbols.

[Image: im_ciphers.png]

The first step in the decryption process is the transcription of these manuscripts, which is difficult because of the high variation in handwriting styles and cipher alphabets and because often only a small number of pages is available. Although different strategies can be used to deal with insufficient training data (e.g. few-shot learning, self-supervised learning), the performance of HTR models is still far from satisfactory. We therefore believe that a competition with a large number of symbol sets and scribes can boost research on HTR in low-resource scenarios. Indeed, the recognition of ciphers is an example of such a low-resource scenario, and one of high historical interest. Thousands of enciphered historical manuscripts are buried in libraries and archives. Transcribing and decrypting the information in these special sources is important to the understanding of our cultural heritage, since it helps to shed new light on, and even to (re-)interpret, our history.


Important Dates

10 January 2024: Competition Announced

20 January 2024: Training data released

18 February 2024: Test data released

10 May 2024: Submission of results