Overview - RoadText Competition on Video Text Detection, Tracking and Recognition

The RoadText challenge is a competition that aims to advance the current state of the art in scene text detection, recognition, and tracking in videos. This is a particularly challenging task due to the unique characteristics of text in driving videos. Unlike text in other types of videos, the text in these videos is often incidental and widely dispersed across the scene. Additionally, the camera movement in these videos can introduce distortions such as motion blur, which can make the text difficult to recognize.

To evaluate and improve methods for this task, the RoadText challenge will be based on the RoadText-1K [1] dataset, which contains 1000 dash cam videos. Each video is 10 seconds long and recorded at 30 frames per second, for 300 frames per video. Text objects typically stay in view only briefly, so models must cope with occlusions and with tiny text instances that are frequently degraded by motion blur and significant perspective distortion. In many cases, a text instance is not fully readable in any single frame, so detections from multiple frames must be combined to transcribe it successfully.
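As a rough illustration of combining per-frame results, the Python sketch below merges the transcriptions recorded along one track by confidence-weighted voting. This is only one simple possibility, not the competition's evaluation protocol or a recommended baseline. The function name, the `(track_id, frame_index, text, confidence)` tuple layout, and the `min_confidence` parameter are all hypothetical and do not reflect the RoadText-1K annotation format.

```python
from collections import Counter, defaultdict


def aggregate_track_transcriptions(detections, min_confidence=0.0):
    """Pick one transcription per track by confidence-weighted voting.

    `detections` is an iterable of (track_id, frame_index, text, confidence)
    tuples. This layout is a hypothetical one chosen for the example.
    """
    votes = defaultdict(Counter)
    for track_id, _frame_index, text, confidence in detections:
        # Skip empty readings and readings below the confidence threshold.
        if text and confidence >= min_confidence:
            votes[track_id][text] += confidence
    # For each track, keep the transcription with the largest summed confidence.
    return {
        track_id: counter.most_common(1)[0][0]
        for track_id, counter in votes.items()
    }


# Toy per-frame readings for two tracks (made-up values).
detections = [
    (7, 0, "ST0P", 0.40),  # blurred frame, misread character
    (7, 1, "STOP", 0.90),
    (7, 2, "STOP", 0.80),
    (7, 3, "", 0.10),      # unreadable in this frame
    (9, 1, "EXIT", 0.70),
    (9, 2, "EX1T", 0.30),
]

result = aggregate_track_transcriptions(detections)
```

Voting over whole strings is the simplest choice. A character-level alignment across frames could recover text that is never read correctly in full in any single frame, at the cost of more complexity.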

Overall, the RoadText challenge focuses on detecting, tracking, and recognizing text instances in videos, with an emphasis on developing models that can handle the unique challenges presented by text in driving videos. By addressing these challenges, the competition hopes to contribute to technology that can assist with tasks such as automatic translation of road signs and improved navigation for self-driving vehicles.

Figure 1: Example frames from clips in RoadText-1K with text location and transcription annotations overlaid. Green boxes mark English text, blue boxes mark non-English text, and red boxes mark illegible text.


[1] Sangeeth Reddy, Minesh Mathew, Lluis Gomez, Marcal Rusinol, Dimosthenis Karatzas, and C. V. Jawahar. RoadText-1K: Text detection & recognition dataset for driving videos, 2020.

Challenge News

Important Dates

24-31 December 2022: Initial website launch

24-31 December 2022: Initial training data release

15 February 2023: Full training data along with test data release

1 March 2023: Submission site open

20 March 2023: Deadline for competition submissions

27 March 2023: Extended deadline for competition submissions

1 May 2023: Initial submission of competition report

21-26 August 2023: Result announcement and presentation


All deadlines are in the Anywhere on Earth (AoE) time zone.