- Task 1 - Key Information Localization and Extraction - Method: YOLOv8X+Grid
Method: YOLOv8X+Grid
Date: 2023-05-08
Authors: Jakub Straka
Affiliation: University of West Bohemia, Department of Cybernetics
Description: The KILE task can be solved in many different ways. We approached it as object detection, treating each field in the document as an object. We used YOLOv8X as the detection model. The model is based on a convolutional neural network, and its advantages include speed and small size. We also incorporated methods from [1].
1. Anoop Raveendra Katti, Christian Reisswig, Cordula Guder, Sebastian Brarda, Steffen Bickel, Johannes Höhne, and Jean Baptiste Faddoul. Chargrid: Towards understanding 2D documents. arXiv preprint arXiv:1809.08799, 2018.
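
To make the "fields as objects" framing in the description concrete, the following is a minimal sketch, not the authors' code. It assumes the commonly used YOLO label layout (class index followed by a normalized box center, width, and height) and shows a simplified character grid in the spirit of [1]. The field names in `FIELD_CLASSES`, the helper names, and the use of `ord()` as the character code are all hypothetical. How the grid was actually combined with YOLOv8X is not stated in the description.

```python
from typing import Dict, List, Tuple

# Hypothetical field types; the real label set is not given in the description.
FIELD_CLASSES: Dict[str, int] = {"invoice_number": 0, "total_amount": 1}

# (left, top, right, bottom) in page pixels.
Box = Tuple[float, float, float, float]


def field_to_yolo_label(field_type: str, box: Box,
                        page_w: float, page_h: float) -> str:
    """Convert one annotated field into a YOLO-style label line."""
    left, top, right, bottom = box
    x_center = (left + right) / 2 / page_w
    y_center = (top + bottom) / 2 / page_h
    width = (right - left) / page_w
    height = (bottom - top) / page_h
    cls = FIELD_CLASSES[field_type]
    return f"{cls} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def build_chargrid(chars: List[Tuple[str, Box]],
                   page_w: float, page_h: float,
                   grid_w: int, grid_h: int) -> List[List[int]]:
    """Rasterize character boxes into a grid of codes (0 = background).

    Assumption: ord() is used as the character code purely for illustration.
    """
    grid = [[0] * grid_w for _ in range(grid_h)]
    for ch, (left, top, right, bottom) in chars:
        # Map pixel coordinates to grid cells; cover at least one cell.
        x0 = int(left / page_w * grid_w)
        x1 = max(x0 + 1, int(right / page_w * grid_w))
        y0 = int(top / page_h * grid_h)
        y1 = max(y0 + 1, int(bottom / page_h * grid_h))
        for y in range(y0, min(y1, grid_h)):
            for x in range(x0, min(x1, grid_w)):
                grid[y][x] = ord(ch)
    return grid
```

For example, `field_to_yolo_label("total_amount", (100, 200, 300, 250), 1000, 1000)` produces one label line for a field on a 1000 by 1000 pixel page, and `build_chargrid` turns OCR character boxes into a coarse 2D map that could, under the stated assumptions, be supplied to a detector as an additional input.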