method: TransDETR (2023-03-26)
Authors: Yu Hao, Chuhui Xue, Wenqing Zhang, Song Bai
Affiliation: ByteDance Inc.
Email: jinyu121@gmail.com
Description: We use TransDETR [1]. We start from weights pre-trained on ICDAR2015 video, then fine-tune the network on RoadText3K and BOVText for 20 epochs. Finally, we fine-tune the network on RoadText for another 20 epochs.
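The staged schedule described above can be written down as a small data structure. This is only an illustrative sketch, not the authors' training code: the names `Stage`, `SCHEDULE`, `stated_finetune_epochs`, and `describe` are hypothetical, and the pre-training epoch count is left as `None` because the description does not state it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical helper type, not part of TransDETR)."""
    name: str
    datasets: Tuple[str, ...]
    epochs: Optional[int]  # None where the description gives no epoch count


# Stages in the order given in the description.
SCHEDULE = (
    Stage("pretrain", ("ICDAR2015 video",), None),
    Stage("finetune_1", ("RoadText3K", "BOVText"), 20),
    Stage("finetune_2", ("RoadText",), 20),
)


def stated_finetune_epochs(schedule: Tuple[Stage, ...]) -> int:
    """Sum the epoch counts that the description states explicitly."""
    return sum(s.epochs for s in schedule if s.epochs is not None)


def describe(schedule: Tuple[Stage, ...]) -> List[str]:
    """Return one human-readable line per stage."""
    lines = []
    for i, s in enumerate(schedule, 1):
        epochs = "unspecified" if s.epochs is None else str(s.epochs)
        lines.append(f"{i}. {s.name}: {' + '.join(s.datasets)} ({epochs} epochs)")
    return lines


if __name__ == "__main__":
    print("\n".join(describe(SCHEDULE)))
    print("stated fine-tuning epochs:", stated_finetune_epochs(SCHEDULE))
```

Keeping the stages in an ordered tuple makes the dependency explicit: each stage is assumed to start from the weights produced by the previous one.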
[1] End-to-end Video Text Spotting with Transformer
[2] Read while you drive - multilingual text tracking on the road