method: CLOVA OCR (2019-05-02)

Authors: Sungrae Park, Seonghyeon Kim, Seung Shin, Jaeheung Surh, Junyeop Lee, Hwalsuk Lee

Description: A sequence tagging approach that classifies the tokens in all text boxes.
We fine-tuned a pre-trained BERT model to classify every token into one of five classes: "none", "company", "address", "date", and "total". To feed the model a single text sequence, we identified the lines in a receipt, sorted the text boxes within each line, and merged them line by line.
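
The line-merging step described above can be sketched as follows. This is only an illustrative reading of the description, not the authors' code: the `TextBox` fields, the `merge_boxes_into_sequence` helper, and the `tolerance` heuristic for deciding that two boxes share a line are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    # Hypothetical box layout: text plus left edge, top edge, and height.
    text: str
    x: float
    y: float
    height: float


def center_y(box):
    return box.y + box.height / 2


def merge_boxes_into_sequence(boxes, tolerance=0.5):
    """Group boxes into lines, sort each line left to right, join line by line.

    Assumed heuristic: a box joins the current line when its vertical center
    lies within `tolerance` * (mean box height of that line) of the line's
    mean vertical center.
    """
    lines = []
    # Visit boxes from top to bottom so lines are built in reading order.
    for box in sorted(boxes, key=center_y):
        if lines:
            current = lines[-1]
            line_center = sum(center_y(b) for b in current) / len(current)
            line_height = sum(b.height for b in current) / len(current)
            if abs(center_y(box) - line_center) <= tolerance * line_height:
                current.append(box)
                continue
        lines.append([box])
    # Sort the boxes inside each line by their left edge, then merge.
    merged = [" ".join(b.text for b in sorted(line, key=lambda b: b.x))
              for line in lines]
    return "\n".join(merged)


sample = [
    TextBox("TOTAL", 10, 100, 20),
    TextBox("12.50", 200, 102, 20),
    TextBox("ACME", 10, 10, 20),
    TextBox("STORE", 80, 12, 20),
    TextBox("2019-05-02", 10, 50, 20),
]
print(merge_boxes_into_sequence(sample))
```

The resulting string is what a tokenizer would then split into tokens for the five-class tagger; skewed or curved receipts would likely need a more careful line detector than this center-distance rule.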

Authors: Parth Chawla

Affiliation: BMS Institute of Technology

Description: Custom-trained a blank English language model with spaCy on strings reconstructed from the optical character recognition task data.

Authors: Songyi Yang, Shengjie Xiu, Niansong Zhang

Description: This method treats key information extraction as a character-wise classification problem, solved with a simple stacked bidirectional LSTM. It first formats the text from an image into a single sequence. The sequence is then fed into a two-layer bidirectional LSTM, which assigns each character one of five class labels: four key information categories and one "others" class. The method is simple, just a two-layer bidirectional LSTM implemented in PyTorch, yet it proves sufficient for understanding the context of a receipt text and producing highly accurate results.
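
To illustrate what a per-character labeling yields, the sketch below turns a sequence of character labels back into field strings. It does not reproduce the authors' post-processing, which the description does not specify: the `decode_fields` helper is hypothetical, and the class names "company" and "total" are assumed from the task's key categories.

```python
from itertools import groupby


def decode_fields(text, labels, ignore="others"):
    """Collect the characters of each class from per-character labels.

    `labels` holds one class name per character of `text`. Contiguous runs
    with the same label become one span; spans of the same class are joined
    with a space. Characters labeled `ignore` are dropped.
    """
    if len(text) != len(labels):
        raise ValueError("need exactly one label per character")
    fields = {}
    pos = 0
    for label, run in groupby(labels):
        length = len(list(run))
        span = text[pos:pos + length].strip()
        pos += length
        if label != ignore and span:
            fields.setdefault(label, []).append(span)
    return {label: " ".join(spans) for label, spans in fields.items()}


text = "ACME STORE\nTOTAL 12.50"
# Stand-in for model output: 10 company chars, 7 others, 5 total chars.
labels = ["company"] * 10 + ["others"] * 7 + ["total"] * 5
print(decode_fields(text, labels))
```

In a real pipeline the `labels` list would come from the arg-max over the LSTM's per-character class scores rather than being written by hand.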

Ranking Table

Date | Method | Recall | Precision | Hmean
2019-05-02 | CLOVA OCR | 89.05% | 89.05% | 89.05%
2020-12-28 | Custom Named Entity Recognition | 77.59% | 77.59% | 77.59%
2019-04-28 | A Simple Method for Key Information Extraction as Character-wise Classification with LSTM | 75.58% | 75.58% | 75.58%
2019-05-05 | Location-aware BERT model for Text Information Extraction | 74.42% | 74.42% | 74.42%
2019-05-02 | With receipt framing | 63.04% | 63.54% | 63.29%
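
The Hmean column is consistent with the harmonic mean of recall and precision, which can be checked against the one row where the two differ. The `hmean` helper below is a small illustrative function, not part of any competition tooling.

```python
def hmean(recall, precision):
    """Harmonic mean of recall and precision (both in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# "With receipt framing" row: recall 63.04%, precision 63.54%.
print(round(hmean(63.04, 63.54), 2))  # 63.29
```

When recall and precision are equal, as in the other four rows, the harmonic mean equals that shared value.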

Ranking Graphic