method: E2E-MLT (2019-05-22)

Authors: Yash Patel, Michal Busta, Jiri Matas

Description: An end-to-end trainable (fully differentiable) method for multi-language scene text localization and recognition is proposed. The approach is based on a single fully convolutional network (FCN) with shared layers for both tasks.
E2E-MLT is the first published multi-language OCR for scene text. Although trained in a multi-language setup, E2E-MLT performs competitively with methods trained on English scene text alone. The experiments also show that obtaining accurate multi-language, multi-script annotations is a challenging problem.

Ranking Table

Date        Method   Hmean   Precision  Recall  Average Precision  1-NED   1-NED (Case Sens.)  Hmean (Case Sens.)
2019-05-22  E2E-MLT  26.46%  37.44%     20.47%  7.72%              26.39%  25.71%              24.85%
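
As a sanity check on the table, the Hmean column can be related to Precision and Recall. The sketch below assumes Hmean is the usual harmonic mean of the two (the F1 score); the page itself does not spell out the formula, and the function name `hmean` is only illustrative. It works from the rounded percentages shown above, so the result can differ from the reported value in the last digit.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of Hmean)."""
    if precision + recall == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded percentages taken from the ranking table.
reported_precision = 37.44
reported_recall = 20.47
reported_hmean = 26.46

computed = hmean(reported_precision, reported_recall)
print(f"computed Hmean: {computed:.2f}% (reported: {reported_hmean:.2f}%)")
```

Under this assumption the computed value agrees with the reported Hmean to within about 0.01 percentage points, which is consistent with rounding of the inputs.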

Ranking Graphic