method: Naver Labs (2018-06-25)
Authors: Animesh Prasad, Hervé Déjean, Jean-Luc Meunier, Max Weidemann, Johannes Michael, Gundram Leifert
Description: For this task we use a pipeline approach: the line image is first preprocessed and then passed through a CNN-BLSTM architecture trained with CTC loss (i.e. HTR). In the next step, we run a BLSTM over a feature layer (computed as all character n-grams of the tokens obtained from best-effort decoding of the HTR output), trained with cross-entropy loss to maximize accuracy.
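The character n-gram feature step can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name and the choice of maximum n-gram length are assumptions.

```python
def char_ngrams(token, max_n=3):
    """Collect all character n-grams (n = 1..max_n) of a token.

    Hypothetical stand-in for the feature layer described above: every
    character n-gram of each token from the best-effort CTC decoding of
    the HTR output would be gathered and fed to the downstream BLSTM.
    """
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(token) - n + 1):
            grams.append(token[i:i + n])
    return grams

# n-grams of one decoded token
features = char_ngrams("Anna", max_n=2)
```

In the actual system these n-grams would be embedded or hashed into a fixed-size vector before entering the BLSTM; that mapping is not specified in the description.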
method: Joint HTR + NER no postprocessing (2018-10-27)
Authors: Manuel Carbonell, Mauricio Villegas, Alicia Fornés, Josep Lladós
Description: Given input lines, we feed them into a CRNN model and jointly predict the transcription, named entities, and person tags by combining them into an extended alphabet, predicting at each time step either a transcription symbol or the tag of the upcoming word.
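Decoding such an extended-alphabet output back into a transcription plus per-word tags can be sketched as below. This is a minimal illustration under stated assumptions: the tag symbol set, the default "O" label for untagged words, and the function name are all hypothetical, not taken from the authors' system.

```python
def decode_joint(symbols, tag_symbols):
    """Split a joint CRNN output over an extended alphabet into a
    transcription string and one entity tag per word.

    Assumption (per the description above): a tag symbol emitted at some
    time step labels the upcoming word; all other symbols are ordinary
    transcription characters.
    """
    transcription = []
    word_tags = []
    pending = "O"       # default tag: outside any entity
    word_open = False
    for s in symbols:
        if s in tag_symbols:
            pending = s          # remember tag for the upcoming word
        elif s == " ":
            transcription.append(s)
            word_open = False    # next character starts a new word
        else:
            if not word_open:    # first character of a new word
                word_tags.append(pending)
                pending = "O"
                word_open = True
            transcription.append(s)
    return "".join(transcription), word_tags
```

A usage example: feeding `["<name>", "A", "n", "a", " ", "d", "e"]` with tag set `{"<name>", "<surname>"}` yields the transcription `"Ana de"` with word tags `["<name>", "O"]`.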
Date | Method | Basic Score | Complete Score | Name | Surname | Location | Occupation | State | Input Type
---|---|---|---|---|---|---|---|---|---
2018-06-25 | Naver Labs | 95.46% | 95.03% | 97.01% | 92.73% | 95.03% | 96.43% | 96.41% | LINE
2018-10-27 | Joint HTR + NER no postprocessing | 90.59% | 89.40% | 89.94% | 84.07% | 90.71% | 92.10% | 96.59% | LINE