TA-VTT | 52.30% | 77.45% | 69.64% | 21 | 11 | 10 | 396 | 6 | 43 | 273 |
SVRepV2(Kuaishou-MMU) | 51.85% | 73.60% | 72.68% | 21 | 12 | 9 | 443 | 8 | 93 | 224 |
LOGO | 47.56% | 76.34% | 67.36% | 17 | 13 | 12 | 394 | 13 | 73 | 268 |
h&h_lab | 47.11% | 76.25% | 68.43% | 17 | 16 | 9 | 404 | 7 | 86 | 264 |
Semantic-Aware Video Text Detection | 44.74% | 79.12% | 58.20% | 19 | 9 | 14 | 350 | 6 | 48 | 319 |
GoMatching++ | 42.81% | 78.57% | 62.14% | 22 | 3 | 17 | 376 | 11 | 87 | 288 |
HIK_OCR | 38.22% | 77.85% | 58.89% | 14 | 6 | 22 | 304 | 4 | 46 | 367 |
TransDETR | 37.63% | 75.59% | 63.69% | 21 | 8 | 13 | 412 | 8 | 158 | 255 |
TransVTSpotter | 35.70% | 77.93% | 56.79% | 12 | 19 | 11 | 340 | 13 | 99 | 322 |
GoMatching | 35.26% | 79.30% | 57.72% | 18 | 2 | 22 | 323 | 5 | 85 | 347 |
VideoTextSCM | 31.56% | 78.39% | 51.67% | 19 | 9 | 14 | 281 | 25 | 68 | 369 |
GOCR | 26.07% | 76.63% | 54.58% | 19 | 5 | 18 | 327 | 16 | 151 | 332 |
Megvii-Image++ | 25.78% | 67.16% | 44.88% | 4 | 7 | 31 | 208 | 1 | 34 | 466 |
GOCR Offline | 24.59% | 76.97% | 52.38% | 15 | 3 | 24 | 297 | 12 | 131 | 366 |
USTB_TexVideo II-2 | 16.74% | 73.78% | 27.89% | 3 | 11 | 28 | 137 | 3 | 24 | 535 |
SRC-B-TextProcessingLab | 16.59% | 67.78% | 33.37% | 6 | 6 | 30 | 153 | 0 | 41 | 522 |
USTB_TexVideo | 13.63% | 73.41% | 26.65% | 3 | 6 | 33 | 132 | 1 | 40 | 542 |
AJOU | 8.89% | 77.07% | 20.07% | 1 | 2 | 39 | 99 | 4 | 39 | 572 |
USTB_TexVideo II-1 | 7.41% | 71.83% | 15.54% | 1 | 1 | 40 | 73 | 1 | 23 | 601 |
StradVision-1 | -4.59% | 69.25% | 19.55% | 3 | 6 | 33 | 117 | 32 | 148 | 526 |
RTST Lucas-Kanade-2 (RealTimeSceneText_LucasKanade_v2) | -147.26% | 76.78% | 8.04% | 3 | 4 | 35 | 73 | 51 | 1067 | 551 |