Method: Naver Labs - Task 1 - End to End Recognition - Information Extraction in Historical Handwritten Records

method: Naver Labs 2018-06-25

Authors: Animesh Prasad, Hervé Déjean, Jean-Luc Meunier, Max Weidemann, Johannes Michael, Gundram Leifert

Description: For this task we use a pipeline approach. First, the line image is preprocessed and passed through a CNN-BLSTM architecture trained with CTC loss (i.e., HTR). In the next step, we run a BLSTM over a feature layer, computed as all character n-grams of the tokens generated by best-effort decoding of the HTR output, and train it with cross-entropy loss to maximize accuracy.
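
The hand-off between the two stages can be illustrated with a minimal sketch. It is not the submitted system: it assumes that "best-effort decoding" means greedy best-path CTC decoding, that the blank symbol sits at index 0, and that "all character n-grams" means every contiguous substring of a token. The names ctc_greedy_decode, char_ngrams and BLANK are hypothetical and chosen only for illustration.

```python
from itertools import groupby

BLANK = 0  # assumption: index of the CTC blank symbol in the alphabet


def ctc_greedy_decode(frame_scores, alphabet):
    """Greedy (best-path) CTC decoding of per-frame class scores.

    Takes the highest-scoring class in each frame, collapses consecutive
    repeats, then drops blanks.
    """
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    collapsed = [index for index, _ in groupby(best)]
    return "".join(alphabet[index] for index in collapsed if index != BLANK)


def char_ngrams(token, n_max=None):
    """Return the character n-grams of a token for n = 1 .. n_max.

    With n_max=None every length up to len(token) is used, i.e. all
    contiguous substrings (an assumption about "all character n-grams").
    """
    limit = len(token) if n_max is None else min(n_max, len(token))
    return [
        token[start:start + n]
        for n in range(1, limit + 1)
        for start in range(len(token) - n + 1)
    ]


def line_to_features(frame_scores, alphabet, n_max=None):
    """Decode one text line and build per-token n-gram lists.

    The resulting lists are what a second-stage tagger (a BLSTM in the
    description above) could consume after being mapped to vectors.
    """
    text = ctc_greedy_decode(frame_scores, alphabet)
    return [(token, char_ngrams(token, n_max)) for token in text.split()]
```

In this sketch each token of a decoded line yields one list of n-grams; how those lists are embedded before the second BLSTM is not stated in the description and is left open here.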

@article{prasad2018bench,
  title={Bench-Marking Information Extraction in Semi-Structured Historical Handwritten Records},
  author={Animesh Prasad and Herv{\'e} D{\'e}jean and Jean-Luc Meunier and Max Weidemann and Johannes Michael and Gundram Leifert},
  journal={arXiv preprint arXiv:1807.06270},
  year={2018}
}

Source code