Overview - ICDAR 2019 Robust Reading Challenge on Multi-lingual scene text detection and recognition
RRC-MLT-2019 Call for Participation: download
Text detection and recognition in natural environments is a key component of many applications, ranging from business card digitization to indexing the shops along a street. This new competition aims to assess the ability of state-of-the-art methods to detect and recognize multi-lingual text. This situation is encountered in modern cities where multiple cultures live and communicate together, and where users see various scripts and languages in a way that prevents relying on much a priori knowledge. Multi-lingual text also poses a problem when analyzing streams of content gathered on the Internet.
Registration
To register for this MLT challenge of the RRC competition 2019, please do the following:
1) Register on the RRC portal as a user (if you are not already a registered user); this will allow you to access the "downloads".
2) Send an email to n[dot]nayef[at]gmail[dot]com with the title "Participation in the RRC-MLT-2019 challenge"
This registration process does not oblige you to participate or submit results; it is an expression of interest. You can participate in one or more tasks of the challenge. It is not obligatory to participate in all the tasks.
Motivation and relevance to the ICDAR community
In this proposed competition, we ask whether text detection and recognition methods (deep learning-based or otherwise) can handle different scripts and languages without fundamental changes to the algorithms and techniques used, or whether script-specific methods are really needed. The ultimate goal of robust reading is to be able to read the text that appears in any captured image, regardless of image source (type), image quality, text script or any other difficulty. Much research has been devoted to solving this problem. The previous editions of the RRC competitions, along with other works, have provided useful datasets to help researchers tackle each of these problems in order to robustly read text in natural scene images. In this competition, we extend the state of the art further by tackling the problem of multi-lingual text detection, recognition and script identification. In other words, methods should be script-robust.
Although datasets for scene text detection and for script identification already exist, our proposed dataset offers interesting novel aspects. The dataset is composed of complete scene images covering 10 languages that represent 7 different scripts. It combines text detection and recognition with script identification, and contains many more images than related datasets. The number of images per script is equal. This makes it a useful benchmark for the task of multi-lingual scene text detection. The considered languages are the following: Chinese, Japanese, Korean, English, French, Arabic, Italian, German, Bangla and Hindi.
Such a dataset is a natural extension of the RRC series, with more scripts and more images, while focusing only on intentional (or focused) text. It addresses the community's need for improved and robust scene text detection. The target audience of this dataset is not only the ICDAR community but also the computer vision community. In both communities, researchers work on analyzing scenes, scene text detection and recognition, quality of text images and script identification.
The datasets available in the literature for scene text detection are mostly not multilingual. The datasets that do contain multi-script text are either built for Indian scripts only, or contain a small number of scripts (2 to 4) with a relatively small number of images. Moreover, datasets that have been created for the task of script identification (classification) are composed of cropped word images.
Challenge News
- 01/10/2019: New Challenges for 2019 Announced
- 11/11/2018: Special Issue on Scene Text Reading and its Applications
- 03/26/2018: Do NOT use qq.com emails to register or contact us
- 03/22/2018: Downtime due to scheduled revisions on 26 and 27 March 2018
- 04/03/2017: Downtime due to scheduled revision on 11 and 12 April 2017
Important Dates
15 Feb to 2 May
- Manifestation of interest by participants opens
- Asking/Answering questions about the details of the competition
1 Mar
- Competition formal announcement
11 Mar
- Website fully ready
- Registration of participants continues
- Evaluation protocol, file formats etc. available
11 Mar to 2 May
- Train set available - training period - MLT challenge in progress - Participants evaluate their methods on the training/validation sets - Prepare for submission
- Registration is still open
2 May
- Registration closes for this MLT challenge for ICDAR-2019
2 May to 1 June
- Test set available
1 June
- Deadline for submission of results by participants
20 - 25 Sept
- Announcement of results at ICDAR2019
1 Oct
- The public release of the full dataset