Overview - Born-Digital Images (Web and Email)
Images are frequently used in electronic documents (Web and email) to embed textual information. The use of images as text carriers stems from a number of needs. For example, images are used to beautify (e.g. titles and headings), to attract attention (e.g. advertisements), to hide information (e.g. images in spam emails used to evade text-based filtering), and even to tell a human apart from a computer (CAPTCHA tests).
Automatically extracting text from born-digital images is therefore an interesting prospect, as it would provide the enabling technology for a number of applications, such as improved indexing and retrieval of Web content, enhanced content accessibility, and content filtering (e.g. of advertisements or spam emails).
Born-digital text images are on the surface very similar to real scene text images (both feature text in complex colour settings), yet the two are distinctly different. Born-digital images are inherently low-resolution (made to be transmitted online and displayed on a screen) and their text is digitally created on the image; scene text images, on the other hand, are high-resolution and camera-captured. While born-digital images might suffer from compression artefacts and severe anti-aliasing, they do not share the illumination and geometrical problems of real-scene images. Therefore, methods developed for one domain will not necessarily work in the other.
In 2011 we set out to establish the state of the art in text extraction in both domains (born-digital images and real scene images). We received 24 submissions over three different tasks in the born-digital Challenge: 10 during the competition run and 14 more over the following year, after the competition was opened in a continuous mode in October 2011.
Given the strong interest displayed by the community, and the fact that there is still a large margin for improvement, in the ICDAR 2013 edition we revisited the tasks of localisation, segmentation and recognition and invited further submissions on an updated and even more challenging dataset. We received 13 submissions during the 2013 edition and the year following it, when the competition was opened in a continuous mode.
For the 2015 edition, we are introducing a new task: End-to-End, which refers to text localisation and recognition in a single step at the word level. The rest of the tasks remain open in a continuous mode, unchanged from the 2013 edition. See details in the Tasks page.
The results from the past ICDAR competitions can be found in the ICDAR proceedings [1, 2].
- D. Karatzas, F. Shafait, S. Uchida, M. Iwamura, L. Gomez, S. Robles, J. Mas, D. Fernandez, J. Almazan, L.P. de las Heras, "ICDAR 2013 Robust Reading Competition", In Proc. 12th International Conference on Document Analysis and Recognition, 2013, IEEE CPS, pp. 1115-1124.
- D. Karatzas, S. Robles Mestre, J. Mas, F. Nourbakhsh, P. Pratim Roy, "ICDAR 2011 Robust Reading Competition - Challenge 1: Reading Text in Born-Digital Images (Web and Email)", In Proc. 11th International Conference on Document Analysis and Recognition, 2011, IEEE CPS, pp. 1485-1490.