method: Megvii-Image++ (2016-04-13)
Authors: Jia Yu, Xinyu Zhou, Cong Yao, Jianan Wu, Chi Zhang, Shuchang Zhou
Description: The detection part is an FCN that extracts text regions directly from the original images. The tracker is a network-flow-based association algorithm. The recognition part is a second neural network that performs whole-word recognition.
method: Baseline-TextSpotter (2015-03-30)
Authors: Lukas Neumann, Jiri Matas, Michal Busta
Description: TextSpotter is used for frame-by-frame detection. The FoT tracker of Tomas Vojir et al. is used for tracking.
TextSpotter is an unconstrained real-time end-to-end text localization and recognition method. The real-time performance is achieved by posing the character detection problem as an efficient sequential selection from the set of Extremal Regions (ERs). ERs are grouped into word regions which are recognized using an approximate nearest-neighbor classifier operating on a coarse Gaussian scale-space pyramid. A demo of the software is available online: http://www.textspotter.org
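
To make the notion of Extremal Regions concrete, here is a minimal sketch that enumerates them by brute force: every connected component of the image thresholded at some intensity level is an ER. This is not TextSpotter's sequential selection, which is incremental and classifier-driven; the function names, the 4-connectivity choice, and the toy image are assumptions made for illustration only.

```python
from collections import deque


def connected_components(mask):
    """Return the 4-connected components of the True cells in a 2D boolean grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited True cell.
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(frozenset(region))
    return components


def extremal_regions(image):
    """Map each dark-on-bright ER to the lowest threshold at which it appears."""
    regions = {}
    for threshold in sorted({v for row in image for v in row}):
        mask = [[v <= threshold for v in row] for row in image]
        for region in connected_components(mask):
            regions.setdefault(region, threshold)
    return regions


# Hypothetical 4x5 grayscale image: two dark strokes on a bright background.
image = [
    [9, 9, 9, 9, 9],
    [9, 1, 9, 2, 9],
    [9, 1, 9, 2, 9],
    [9, 9, 9, 9, 9],
]

for region, threshold in extremal_regions(image).items():
    print(f"threshold={threshold} size={len(region)}")
```

Each stroke shows up as its own region before the background merges everything into one component at the highest threshold. A real detector would score such candidates with a character classifier rather than keep them all.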
The FoT tracker [1] can be found here:
http://cmp.felk.cvut.cz/~vojirtom/
[1] Tomas Vojir and Jiri Matas, “The Enhanced Flock of Trackers”. Registration and Recognition in Images and Videos, Studies in Computational Intelligence, Springer, 2014.
method: Stradvision-1 (2015-04-17)
Authors: H. Cho, M. Sung, and B. Jun
Description: First, we extract character candidates using extremal regions (ERs). Second, we verify the extracted candidates with a character classifier trained by Agile Learning. Next, we perform text-patch matching, which greatly improves recall, and group the characters into text regions. Finally, we apply a deep neural network for character recognition. To track the text regions, we combine "detection by tracking" and "tracking by detection".
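
As a rough illustration of the "tracking by detection" half only, the sketch below links per-frame detections into tracks by greedy IoU matching against the previous frame. It is not the Stradvision method; the function names, the box format `(x1, y1, x2, y2)`, and the `min_iou` threshold are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def track_by_detection(frames, min_iou=0.3):
    """Return, per frame, the track ID assigned to each detection box."""
    next_id = 0
    previous = []  # (track_id, box) pairs from the previous frame
    output = []
    for boxes in frames:
        # Score every (previous track, current box) pair, best overlap first.
        candidates = sorted(
            ((iou(pbox, box), pi, bi)
             for pi, (_, pbox) in enumerate(previous)
             for bi, box in enumerate(boxes)),
            reverse=True,
        )
        assigned, used_prev = {}, set()
        for score, pi, bi in candidates:
            if score < min_iou:
                break  # remaining pairs overlap too little to be the same text
            if pi in used_prev or bi in assigned:
                continue
            assigned[bi] = previous[pi][0]
            used_prev.add(pi)
        # Unmatched detections start new tracks.
        current = []
        for bi, box in enumerate(boxes):
            if bi not in assigned:
                assigned[bi] = next_id
                next_id += 1
            current.append((assigned[bi], box))
        output.append([track_id for track_id, _ in current])
        previous = current
    return output


# Hypothetical detections: two words in frame 0, listed in swapped order in frame 1.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(51, 50, 61, 60), (1, 0, 11, 10)],
    [(2, 0, 12, 10)],
]
print(track_by_detection(frames))
```

Because this sketch remembers only the previous frame, a word missed by the detector for a single frame comes back with a new ID. Propagating boxes with a tracker between detections ("detection by tracking") is one way to bridge such gaps.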
Date | Method | MOTA | MOTP | IDF1 | Mostly Matched | Partially Matched | Mostly Lost
---|---|---|---|---|---|---|---
2016-04-13 | Megvii-Image++ | 61.21% | 64.95% | 0.00% | | |
2015-03-30 | Baseline-TextSpotter | 59.83% | 69.51% | 0.00% | | |
2015-04-17 | Stradvision-1 | 56.54% | 69.21% | 0.00% | | |
2015-04-02 | USTB_TexVideo II-2 | 50.52% | 63.48% | 0.00% | | |
2015-04-02 | USTB_TexVideo | 45.82% | 65.08% | 0.00% | | |
2015-04-02 | Deep2Text I (Video) | 35.39% | 62.12% | 0.00% | | |
2015-04-02 | USTB_TexVideo II-1 | 21.16% | 60.46% | 0.00% | | |
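
For readers unfamiliar with the MOTA and MOTP columns, the following sketch shows how the two scores are commonly computed under the standard CLEAR MOT definitions: MOTA is one minus the ratio of total errors (misses, false positives, identity switches) to ground-truth objects, and MOTP is the mean overlap of matched pairs. Whether this benchmark's evaluation script follows exactly these conventions is an assumption, and the `FrameStats` type and its counts are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameStats:
    """Per-frame counts; field names are illustrative, not from the benchmark."""
    num_gt: int            # ground-truth objects in the frame
    misses: int            # ground-truth objects with no matched hypothesis
    false_positives: int   # hypotheses with no matched ground truth
    id_switches: int       # matches whose track ID changed since the last match
    overlaps: List[float]  # IoU of each matched (ground truth, hypothesis) pair


def mota(frames):
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT."""
    total_gt = sum(f.num_gt for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(f.misses + f.false_positives + f.id_switches for f in frames)
    return 1.0 - errors / total_gt


def motp(frames):
    """Multiple Object Tracking Precision: mean overlap over all matches."""
    overlaps = [o for f in frames for o in f.overlaps]
    if not overlaps:
        raise ValueError("MOTP is undefined without matched pairs")
    return sum(overlaps) / len(overlaps)


# Made-up counts for two frames, purely to exercise the formulas.
frames = [
    FrameStats(num_gt=4, misses=1, false_positives=0, id_switches=0,
               overlaps=[0.8, 0.7, 0.6]),
    FrameStats(num_gt=4, misses=0, false_positives=1, id_switches=1,
               overlaps=[0.9, 0.7, 0.6, 0.5]),
]
print(f"MOTA = {mota(frames):.2%}, MOTP = {motp(frames):.2%}")
```

Note that MOTA can go negative when a tracker makes more errors than there are ground-truth objects, whereas MOTP only reflects how well the matched boxes are localized.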