method: TransDETR 2023-03-26

Authors: Yu Hao, Chuhui Xue, Wenqing Zhang, Song Bai

Affiliation: ByteDance Inc.

Email: jinyu121@gmail.com

Description: We use TransDETR [1]. We start from weights pre-trained on the ICDAR2015 Video dataset, then fine-tune the network on RoadText3K and BOVText for 20 epochs. Finally, we fine-tune the network on RoadText for a further 20 epochs.
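
The three-stage schedule described above can be written down as a small data structure. This is only an illustrative sketch: the names `Stage`, `SCHEDULE`, `total_finetune_epochs`, and `describe` are hypothetical and are not part of the TransDETR code base, and the pre-training length is left as `None` because the description does not state it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical helper type, not a TransDETR API)."""
    name: str
    datasets: Tuple[str, ...]
    epochs: Optional[int]  # None means the description does not give a value


# Stages in the order given in the description.
SCHEDULE: List[Stage] = [
    Stage("pretrain", ("ICDAR2015 Video",), None),
    Stage("finetune_1", ("RoadText3K", "BOVText"), 20),
    Stage("finetune_2", ("RoadText",), 20),
]


def total_finetune_epochs(schedule: List[Stage]) -> int:
    """Sum the epochs of every stage whose length is stated."""
    return sum(s.epochs for s in schedule if s.epochs is not None)


def describe(schedule: List[Stage]) -> str:
    """Render the schedule as a numbered, human-readable list."""
    lines = []
    for i, s in enumerate(schedule, 1):
        epochs = "unspecified" if s.epochs is None else str(s.epochs)
        lines.append(f"{i}. {s.name}: {' + '.join(s.datasets)} ({epochs} epochs)")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe(SCHEDULE))
```

Keeping the schedule as plain data makes the stage order and the dataset mix of each stage explicit, independent of whatever training script is actually used.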

[1] End-to-end Video Text Spotting with Transformer
[2] Read While You Drive: Multilingual Text Tracking on the Road