Overview - Document Information Localization and Extraction

DocILE is a large-scale research benchmark for cross-evaluation of machine learning methods for Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR) from semi-structured business documents such as invoices and orders. Such a large-scale benchmark was previously missing (Skalický et al., 2022), hindering comparative evaluation.

For additional information, we refer participants to https://docile.rossum.ai.

Competition and Prizes

The DocILE'23 competition runs as a CLEF 2023 lab and an ICDAR 2023 competition with a single leaderboard. The submission deadline is 10 May 2023. The competition comes with a prize pool of $9000:

  • Top three eligible teams on the KILE leaderboard will receive $2000, $1000 and $500 respectively.
  • Top three eligible teams on the LIR leaderboard will receive $2000, $1000 and $500 respectively.
  • A $2000 best-paper award, selected by the lab organizers and the steering committee.

To participate in the competition (and be eligible for prizes), you need to follow the competition rules. Most notably:

  • You need to register using the CLEF Labs Registration Form.
  • It is prohibited to use external document datasets and models trained on these datasets.

Important Dates

Test set published and submissions open: 24 Apr 2023

Register by: 28 Apr 2023 (at http://clef2023-labs-registration.dei.unipd.it/)

Benchmark submission deadline: 10 May 2023

Working note submission deadline: 5 June 2023

Notification of working note acceptance: 23 June 2023


Note: DocILE 2023 runs simultaneously as a CLEF 2023 Lab and follows the CLEF 2023 schedule: https://clef2023.clef-initiative.eu/index.php?page=Pages/schedule.html

Working note papers: In addition to providing a method description for the submission on the RRC website, participants must submit a working note to the organizers, as explained in the competition rules: https://docile.rossum.ai/static/docile_rules_and_prize_eligibility.pd