method: Character-Aware CNN + Highway + BiLSTM 2.0 2022-04-15

Authors: Njoyim Tchoubith Peguy Calusha

Affiliation: University of Fribourg, Switzerland

Email: pegpeg07@hotmail.com

Description: Here is a simple neural language model (NLM) that relies only on character-level inputs. This model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a bidirectional long short-term memory (BLSTM) recurrent neural network language model (RNN-LM).

Unlike previous works that utilize subword information via morphemes, this model does not require morphological tagging as a pre-processing step. And, unlike the recent line of work which combines input word embeddings with features from a character-level model, this model does not utilize word embeddings at all in the input layer. Given that most of the parameters in NLMs are from the word embeddings, the proposed model has significantly fewer parameters than previous NLMs, making it attractive for applications where model size may be an issue (e.g. cell phones).
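As a concrete illustration of the pipeline described above (character embeddings → CNN with several filter widths → max-over-time pooling → highway layer → BiLSTM), here is a minimal PyTorch sketch. All layer sizes and filter widths are illustrative assumptions, not the submission's actual hyperparameters.

```python
import torch
import torch.nn as nn


class CharAwareEncoder(nn.Module):
    """Char-CNN + highway + BiLSTM sketch. Sizes are illustrative
    assumptions, not the submission's actual hyperparameters."""

    def __init__(self, n_chars=100, char_dim=15,
                 filters=((1, 25), (2, 50), (3, 75)), hidden=300):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # One 1-D convolution per filter width, max-pooled over time.
        self.convs = nn.ModuleList(
            nn.Conv1d(char_dim, n_out, kernel_size=w) for w, n_out in filters)
        n_feat = sum(n_out for _, n_out in filters)
        # Single highway layer: gated mix of a transform and the identity.
        self.transform = nn.Linear(n_feat, n_feat)
        self.gate = nn.Linear(n_feat, n_feat)
        self.bilstm = nn.LSTM(n_feat, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)

    def forward(self, chars):
        # chars: (batch, seq_len, max_token_len) character indices.
        b, s, w = chars.shape
        x = self.char_emb(chars.view(b * s, w)).transpose(1, 2)
        feats = torch.cat(
            [conv(x).max(dim=2).values for conv in self.convs], dim=1)
        t = torch.relu(self.transform(feats))
        g = torch.sigmoid(self.gate(feats))
        h = g * t + (1 - g) * feats                # highway output
        out, _ = self.bilstm(h.view(b, s, -1))     # (batch, seq, 2*hidden)
        return out
```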

To adapt this model to scanned receipts, the following modifications have been made:

- Unlike the original model, which predicts at the word level, predictions are made at the entity level.
- The two LSTM layers are bidirectional (BiLSTM).
- A batch norm layer is added before the highway layer(s).
- The BiLSTM parameters are initialized differently, following this paper: https://arxiv.org/pdf/1702.00071.pdf (see the sketch after this list).
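A minimal sketch of the last two modifications: a batch-norm layer placed in front of the highway transform, plus an explicit initialization pass over the BiLSTM parameters. The cited paper's exact initialization scheme is not reproduced here; orthogonal weights with zero biases are used purely as a stand-in.

```python
import torch
import torch.nn as nn


class BNHighway(nn.Module):
    """Batch norm followed by a single highway layer."""

    def __init__(self, n_feat):
        super().__init__()
        self.bn = nn.BatchNorm1d(n_feat)
        self.transform = nn.Linear(n_feat, n_feat)
        self.gate = nn.Linear(n_feat, n_feat)

    def forward(self, x):                        # x: (batch, n_feat)
        x = self.bn(x)
        t = torch.relu(self.transform(x))
        g = torch.sigmoid(self.gate(x))
        return g * t + (1 - g) * x


def init_bilstm(lstm: nn.LSTM) -> None:
    # Stand-in initialization (orthogonal weights, zero biases); the
    # submission follows arXiv:1702.00071, which is not reproduced here.
    for name, p in lstm.named_parameters():
        if "weight" in name:
            nn.init.orthogonal_(p)
        else:
            nn.init.zeros_(p)
```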

Using the website's evaluation procedure, OCR mismatches are removed and discrepancies in total amounts that are sporadically prefixed with "RM" are fixed, for a fair comparison with other participants.
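For illustration, one way to apply that fix is to strip a leading "RM" currency marker from predicted and ground-truth totals before comparing them. The helper below is hypothetical and is not the competition's official evaluation code.

```python
import re


def normalize_total(value: str) -> str:
    """Strip an optional leading 'RM' currency prefix and surrounding
    whitespace so that 'RM 9.50' and '9.50' compare as equal.
    Hypothetical helper, not the official evaluation script."""
    return re.sub(r"^\s*RM\s*", "", value.strip(), flags=re.IGNORECASE)


assert normalize_total("RM 9.50") == normalize_total("9.50") == "9.50"
```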

method: Character-Aware CNN + Highway + BiLSTM 1.0 2021-10-25

Authors: Njoyim Tchoubith Peguy Calusha

Affiliation: University of Fribourg, Switzerland

Email: pegpeg07@hotmail.com

Description: Here is a simple neural language model (NLM) that relies only on character-level inputs. This model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM).

Unlike previous works that utilize subword information via morphemes, this model does not require morphological tagging as a pre-processing step. And, unlike the recent line of work which combines input word embeddings with features from a character-level model, this model does not utilize word embeddings at all in the input layer. Given that most of the parameters in NLMs are from the word embeddings, the proposed model has significantly fewer parameters than previous NLMs, making it attractive for applications where model size may be an issue (e.g. cell phones).

To adapt this model to scanned receipts, the following modifications have been made:

- Unlike the original model, which predicts at the word level, predictions are made at the text-line level.
- The two LSTM layers are bidirectional.
- A batch norm layer is added before the highway layer(s).
- The BiLSTM parameters are initialized differently, following this paper: https://arxiv.org/pdf/1702.00071.pdf.

Using the website's evaluation procedure, OCR mismatches are removed and discrepancies in total amounts that are sporadically prefixed with "RM" are fixed, for a fair comparison with other participants.

method: PICK-PAPCIC & XZMU 2020-04-15

Authors: Wenwen Yu*, Ning Lu*, Xianbiao Qi, Rong Xiao

Affiliation: PAPCIC & XZMU

Description: We propose PICK, a framework that is effective and robust in handling complex document layouts for key information extraction. It combines graph learning with a graph convolution operation, yielding a richer semantic representation that contains the textual and visual features and the global layout without ambiguity. For the model's output, we designed task-specific rules to constrain the final results.
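As a rough illustration of the graph-convolution idea (not PICK's actual implementation), the sketch below propagates one feature vector per text segment over a fully connected graph with learned soft edge weights; the feature dimension and the fusion of textual and visual features are assumptions. Softmax over the edge scores gives each segment a soft set of neighbours, one simple way to make layout relations learnable.

```python
import torch
import torch.nn as nn


class SimpleGraphConv(nn.Module):
    """One graph-convolution step over text-segment nodes.
    Illustrative only; PICK's actual graph module differs."""

    def __init__(self, dim):
        super().__init__()
        self.edge_score = nn.Linear(2 * dim, 1)   # learned soft adjacency
        self.update = nn.Linear(dim, dim)

    def forward(self, nodes):
        # nodes: (num_segments, dim) fused textual + visual features.
        n, d = nodes.shape
        pairs = torch.cat(
            [nodes.unsqueeze(1).expand(n, n, d),
             nodes.unsqueeze(0).expand(n, n, d)], dim=-1)
        adj = torch.softmax(self.edge_score(pairs).squeeze(-1), dim=-1)
        return torch.relu(self.update(adj @ nodes))  # aggregate neighbours
```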

Ranking Table

Date       | Method                                                                                    | Recall | Precision | Hmean
2022-04-15 | Character-Aware CNN + Highway + BiLSTM 2.0                                                | 98.20% | 98.48%    | 98.34%
2021-10-25 | Character-Aware CNN + Highway + BiLSTM 1.0                                                | 96.18% | 97.45%    | 96.81%
2020-04-15 | PICK-PAPCIC & XZMU                                                                        | 95.46% | 96.79%    | 96.12%
2019-05-02 | CLOVA OCR                                                                                 | 89.05% | 89.05%    | 89.05%
2020-12-28 | Custom Named Entity Recognition                                                           | 77.59% | 77.59%    | 77.59%
2019-04-28 | A Simple Method for Key Information Extraction as Character-wise Classification with LSTM | 75.58% | 75.58%    | 75.58%
2019-05-05 | Location-aware BERT model for Text Information Extraction                                 | 74.42% | 74.42%    | 74.42%
2019-05-02 | With receipt framing                                                                      | 63.04% | 63.54%    | 63.29%
2022-02-01 | Pytesseract + Character Level LSTM + Regex for Dates                                      | 26.95% | 26.95%    | 26.95%

Ranking Graphic