- Task 1 - Text Localization
- Task 2 - Single Image End-to-End Recognition
- Task 3 - Multi Image End-to-End Recognition
method: OCCDet (2024-05-16)
Authors: Yuchen Su, Yongkun Du, Zhaolong Pan, Bin Zhu, Shuai Zhao, Yin Liu, Xuesheng Wang, Yiming Lei, Xingsong Ye, Zhineng Chen
Affiliation: Fudan University
Email: ycsu23@m.fudan.edu.cn
Description: We first use LRANet to predict coarse text detection results, then refine them with six transformer decoder layers. The backbone is ViT-L. We also apply copy-paste data augmentation and a multi-scale testing strategy.
method: OCCDet2 (2024-05-16)
Authors: Yuchen Su, Yongkun Du, Zhaolong Pan, Bin Zhu, Shuai Zhao, Yin Liu, Xuesheng Wang, Yiming Lei, Xingsong Ye, Zhineng Chen
Affiliation: Fudan University
Description: We first use LRANet to predict coarse text detection results, then refine them with six transformer decoder layers. The backbone is ViT-L. We also apply copy-paste data augmentation and a multi-scale testing strategy.
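The descriptions mention a multi-scale testing strategy but do not say how detections from different scales are combined. The following is a minimal sketch of one common approach, not the authors' implementation: run a detector at several scales, map the boxes back to the original image coordinates, and merge them with greedy non-maximum suppression. The `detect_at_scale` callable, the axis-aligned box format, and the IoU threshold are all assumptions made for illustration; real scene-text detectors usually output polygons.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical box format: (x1, y1, x2, y2, score), axis-aligned for simplicity.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def multi_scale_detect(
    detect_at_scale: Callable[[float], List[Box]],
    scales: Sequence[float],
    iou_thr: float = 0.5,
) -> List[Box]:
    """Merge detections from several test scales (assumed merge rule: greedy NMS).

    `detect_at_scale(s)` is a hypothetical callable that returns boxes in the
    coordinates of the image resized by factor `s`.
    """
    candidates: List[Box] = []
    for s in scales:
        for x1, y1, x2, y2, score in detect_at_scale(s):
            # Map coordinates back to the original image resolution.
            candidates.append((x1 / s, y1 / s, x2 / s, y2 / s, score))

    # Keep the highest-scoring box among heavily overlapping candidates.
    candidates.sort(key=lambda b: b[4], reverse=True)
    kept: List[Box] = []
    for box in candidates:
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept
```

Passing a stub detector that returns the same text region at every scale should leave a single merged box, which is a quick way to check the coordinate rescaling.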
Date | Method | Occluded F-score | Occluded Recall | Precision | General Recall | General F-score | Occluded Visible Recall | Occluded Inferable Recall | Occluded Indeterminate Recall
---|---|---|---|---|---|---|---|---|---
2024-05-16 | OCCDet | 73.75% | 75.49% | 72.09% | 64.53% | 68.10% | 76.23% | 83.71% | 74.41%
2024-05-16 | OCCDet2 | 70.67% | 77.29% | 65.10% | 67.49% | 66.27% | 77.78% | 84.47% | 76.38%
2024-05-16 | OCCDet | 70.67% | 77.29% | 65.10% | 67.49% | 66.27% | 77.78% | 84.47% | 76.38%
2024-05-16 | OCCDet | 69.54% | 77.70% | 62.93% | 67.96% | 65.35% | 77.78% | 84.47% | 76.90%
2024-05-15 | det | 66.38% | 64.66% | 68.20% | 54.27% | 60.44% | 66.67% | 73.11% | 63.34%
2024-06-07 | Method | 41.27% | 42.12% | 40.46% | 47.40% | 43.66% | 42.89% | 49.62% | 41.12%
2024-05-09 | Multi-View Fusion Network for Text Spotting | 4.00% | 3.61% | 4.49% | 4.68% | 4.58% | 4.13% | 4.17% | 3.46%
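The page does not define how the F-scores are computed. The reported values are consistent with the usual harmonic mean of recall and precision, with both the Occluded and the General F-score using the single Precision column. The sketch below checks this for the top OCCDet row; the function name and the sharing of one precision value across both groups are inferences from the numbers, not something the page states.

```python
def f_score(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Values copied from the top OCCDet row (2024-05-16).
precision = 72.09
occluded_recall = 75.49
general_recall = 64.53

print(f"{f_score(occluded_recall, precision):.2f}")  # 73.75, matches the Occluded F-score
print(f"{f_score(general_recall, precision):.2f}")   # 68.10, matches the General F-score
```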