method: NSTextractor2013-04-08

Authors: Sergey Milyaev, Olga Barinova, Tatiana Novikova, Pushmeet Kohli, Victor Lempitsky

Description: Our image binarization method incorporates local cues, such as local binarization map and image Laplacian, into a global optimization framework. Each connected component generated from binarization map is then passed to a feature extraction module and classified based on its geometric shape features. Then we filter out the components that are classified as background. At the next step connected components that passed the filter are grouped into textline-candidates based on similarity of their size and color. For each textline we average the outputs of the classifier, perform non-maximum suppression according to the average scores and filter out textlines with low average scores. Thresholds are selected to obtain large number of well segmented characters, while keeping the number of background segment low.