Overview - ICDAR 2017 Robust Reading Challenge on Omnidirectional Video

This challenge focuses on scene text localization and recognition on the Downtown Osaka Scene Text (DOST) dataset [1]. Five tasks will be opened within this Challenge: Text Localisation in Videos, Text Localisation in Still Images, Cropped Word Recognition, End-to-End Recognition in Videos, and End-to-End Recognition in Still Images. See the Tasks page for details.

The DOST dataset preserves scene text as it was observed in the real environment. The dataset contains videos (sequential images) captured in shopping streets in downtown Osaka with an omnidirectional camera. Using an omnidirectional camera helps exclude the user’s intention from image capture. The sequential images are meant to encourage the development of new text detection and recognition techniques that exploit temporal information. Another important feature of the DOST dataset is that it contains non-Latin text: because the images were captured in Japan, they contain a lot of Japanese text as well as an adequate amount of Latin text. Because of these features, we can say that the DOST dataset preserves scene text in the wild.

Important Notice about Ground Truth Quality and Renovation

Before releasing the DOST dataset, we decided to improve (renovate) the quality of its ground truth (GT). In particular, the transcriptions and bounding boxes are being improved, and more non-readable text regions are being marked as "Don't care" regions. However, the renovation has taken much longer than we expected. Given the hard deadline for submitting competition results, we need to release all the files before the renovation is finished. The GT files will be updated when the renovation is completed. When the GT is updated, results that have already been submitted will be automatically re-evaluated against the new GT.

We also note that the GT is not perfect. As you can imagine, improving the ground truth takes a lot of effort. We will do our best, but please understand this limitation. Voluntary contributions are welcome; if you are interested, please contact Masakazu Iwamura.

Anyway, enjoy the DOST dataset!

References

[1] Iwamura, M., Matsuda, T., Morimoto, N., Sato, H., Ikeda, Y., Kise, K.: Downtown Osaka Scene Text Dataset. ECCV 2016 International Workshop of Robust Reading, pp. 440-455 (2016)

Important Dates

April 15: Initial training data available

May 27: More training data available

June 10: Test data available / Submissions open

June 30: Deadline for submission of results

November 10-15: Presentation of results