Overview - ICDAR2017 Robust Reading Challenge on end-to-end recognition on the Google FSNS dataset
This challenge focuses on constrained, real-world, end-to-end scene-text understanding under multiple views, based on the French Street Name Signs (FSNS) dataset. The Google FSNS dataset consists of more than one million images of street name signs cropped from Google Street View images of France. Each image contains four views of the same street name sign, and the text on a sign can span up to three lines. A single canonical transcription is associated with each street sign. More details about the dataset can be found in its accompanying publication.
What are the key challenges?
- End-to-end multi-line scene text recognition
- Inference of the order of recognized words
- Visual recognition of canonical transcriptions
- Large-scale: 1M training samples
- Leveraging multiple views to improve recognition accuracy
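Since each sample bundles several views of the same sign, a system typically splits the input into per-view crops before (or while) recognizing text. The sketch below is a minimal illustration, assuming the views are concatenated horizontally into one image of equal-width tiles (in FSNS, four 150x150 views form a 150x600 image); the function name and shapes are our own, not part of the official tooling.

```python
import numpy as np

def split_views(image: np.ndarray, num_views: int = 4) -> list:
    """Split a horizontally concatenated multi-view image into equal-width crops.

    Assumes the views are tiled side by side along the width axis,
    as in the FSNS 150x600 = 4 x (150x150) layout.
    """
    width = image.shape[1]
    assert width % num_views == 0, "width must divide evenly into views"
    view_width = width // num_views
    return [image[:, i * view_width:(i + 1) * view_width] for i in range(num_views)]

# Example with a dummy FSNS-shaped image (height 150, width 600, RGB).
dummy = np.zeros((150, 600, 3), dtype=np.uint8)
views = split_views(dummy)
print(len(views), views[0].shape)  # 4 (150, 150, 3)
```

A model can then score or attend over the per-view crops and merge their predictions into a single transcription.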
- Train-phase beginning: registration opens and the resources needed for training and validation are made available:
- Train and validation data
- Shell and python scripts to assist manipulation of the data
- Performance evaluation method
- Modular baseline system for a simple end-to-end pipeline: we will release a baseline system with independent modules for word localization and word recognition. Participants may use any of these baseline modules in their submissions.
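Because each street sign has a single canonical transcription, an end-to-end system is naturally scored on full-transcription correctness. The following is a minimal sketch of such an exact-match (sequence) accuracy metric; the whitespace normalization here is our own assumption, not the official evaluation protocol, which is released with the data.

```python
def sequence_accuracy(predictions, ground_truths):
    """Fraction of samples whose predicted transcription exactly matches
    the canonical ground truth.

    Whitespace is collapsed before comparison -- an illustrative choice,
    not necessarily the official normalization rule.
    """
    assert len(predictions) == len(ground_truths)
    matches = sum(
        " ".join(p.split()) == " ".join(g.split())
        for p, g in zip(predictions, ground_truths)
    )
    return matches / len(predictions)

preds = ["Rue de la Gare", "Avenue Foch"]
golds = ["Rue de la Gare", "Avenue  Foch"]
print(sequence_accuracy(preds, golds))  # 1.0
```

Note that an exact-match criterion gives no partial credit: a transcription with a single wrong character, or words emitted in the wrong order, scores zero for that sample.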
- Test phase beginning: participants should freeze all development, tuning, and training of their systems. The sequestered test set will be made available to participants.
- Final submission: participants must submit their method's results on the sequestered test set.
- Competition results: the competition results will be made publicly available during the ICDAR 2017 Conference.