Method: MSER MRF - Task 4 - End-to-End - Focused Scene Text

method: MSER MRF (2015-04-02)

Authors: Xiaolong Liu

Description: The method used to tackle the end-to-end task of the challenge is based on two papers. The first paper, “Multi-script Text Extraction from Natural Scenes”, was published in ICDAR 2013. Its authors first detect Maximally Stable Extremal Regions (MSERs) and train an AdaBoost classifier to filter out regions that cannot be characters. The retained MSERs are clustered, and a second AdaBoost classifier scores each cluster. Only clusters whose scores exceed a threshold are kept and treated as text. The second paper is my own. It has been submitted to ICDAR 2015 under the title “Natural Scene Character Recognition using Markov Random Field”. Given a born-digital template for a character, I sample dozens of points on its contour and add a sparse set of edges between them to form an undirected graph. For a test character, each of these undirected graphs is projected onto it. Energy functions of two orders are defined and attached to the vertices and edges of each undirected graph, so that each graph forms a Markov Random Field (MRF). The best projection of an undirected graph onto the test character is found by minimizing the energy of the corresponding MRF. Across the best projections of the different undirected graphs, I apply an affine calibration to the minimum energies to obtain projection scores, and then make the decision based on these scores and on an analysis of the character regions specified by the projection results.
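As a rough illustration of the energy minimization described above, the sketch below scores a projection of a small template graph onto candidate positions and picks the lowest-energy one by brute force. It is not the submitted implementation: the function names, the dictionary-based vertex costs, the squared-deviation edge cost, and the exhaustive search are all assumptions made for this example, and the affine calibration step is not shown.

```python
from itertools import product

def mrf_energy(assignment, unary, edges, template):
    """Total energy: per-vertex costs plus per-edge costs (assumed forms)."""
    # Vertex term: cost of placing template vertex v at position p.
    energy = sum(unary[v][p] for v, p in enumerate(assignment))
    for i, j in edges:
        # Edge term (assumption): squared deviation of the projected
        # edge vector from the corresponding template edge vector.
        tx = template[j][0] - template[i][0]
        ty = template[j][1] - template[i][1]
        px = assignment[j][0] - assignment[i][0]
        py = assignment[j][1] - assignment[i][1]
        energy += (px - tx) ** 2 + (py - ty) ** 2
    return energy

def best_projection(unary, edges, template):
    """Exhaustively search all candidate placements (toy sizes only)."""
    candidates = [list(costs.keys()) for costs in unary]
    best = min(product(*candidates),
               key=lambda a: mrf_energy(a, unary, edges, template))
    return mrf_energy(best, unary, edges, template), best

# Hypothetical toy data: three template points joined by two edges.
template = [(0, 0), (2, 0), (2, 2)]
edges = [(0, 1), (1, 2)]
# unary[v] maps each candidate position in the test image to a cost.
unary = [
    {(5, 5): 0.0, (6, 5): 1.0},
    {(7, 5): 0.0, (9, 5): 0.0},
    {(7, 7): 0.5, (9, 9): 0.0},
]

energy, projection = best_projection(unary, edges, template)
print(energy, projection)
```

Exhaustive search is only workable for a handful of vertices and candidates; with dozens of contour points, a dedicated inference method would be needed.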
My method consists of three steps: 1) text detection, 2) character recognition, and 3) word recognition. The first and second steps are implemented based on the two papers mentioned above, respectively. Word recognition is accomplished by finding the word in the per-image vocabulary that has the smallest edit distance to the grouped recognised characters. The vocabularies are provided by the organizers of the ICDAR 2015 Competitions.
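The word recognition step can be sketched as a nearest-neighbour lookup under edit distance. The code below is a minimal illustration, not the submitted implementation: it assumes the standard Levenshtein distance with unit costs, and the function names and sample vocabulary are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]

def closest_word(recognised, vocabulary):
    """Return the vocabulary word with the smallest edit distance."""
    return min(vocabulary, key=lambda word: edit_distance(recognised, word))

# Hypothetical per-image vocabulary and a noisy character sequence.
vocabulary = ["EXIT", "ENTER", "STOP"]
print(closest_word("EX1T", vocabulary))
```

When several words are tied, `min` returns the first one in vocabulary order; a real system might break ties with the character recognition scores instead.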