method: TH-TextLoc 2013-04-08

Authors: Cheng Yang, Changsong Liu, Xiaoqing Ding.

Description: TH-TextLoc (method name)

In TH-TextLoc, we adopt a three-step region-based method to localize text in web images. Building on our previous system presented at the ICDAR 2011 Robust Reading Competition, we have improved the algorithms in several ways. We adopt Conditional Random Field (CRF) model analysis to select text candidates more accurately, and we extract text regions using a connected component (CC) linking method.

First, to normalize the resolution of the web images and simplify preprocessing, we up-scale all source images according to their size using bicubic interpolation.
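The up-scaling step can be illustrated with a minimal pure-Python sketch. The description does not give the rule that maps image size to scale factor, so `choose_scale` and its `target_min_side` parameter are hypothetical; the cubic convolution kernel with a = -0.5 is one common bicubic variant and is also an assumption here, not necessarily the one used in TH-TextLoc.

```python
import math

def cubic_kernel(x, a=-0.5):
    """Cubic convolution kernel (a = -0.5 is an assumed, common choice)."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0

def choose_scale(width, height, target_min_side=200):
    """Hypothetical rule: integer factor that brings the shorter side
    up to at least target_min_side pixels (never down-scales)."""
    return max(1, math.ceil(target_min_side / min(width, height)))

def resample_1d(values, scale):
    """Up-sample one row or column by an integer factor."""
    n = len(values)
    out = []
    for i in range(n * scale):
        src = (i + 0.5) / scale - 0.5        # position in source coordinates
        base = math.floor(src)
        acc = 0.0
        for k in range(base - 1, base + 3):  # four nearest source samples
            idx = min(max(k, 0), n - 1)      # replicate border pixels
            acc += values[idx] * cubic_kernel(src - k)
        out.append(acc)
    return out

def resize_bicubic(img, scale):
    """Separable bicubic up-scaling of a grayscale image (list of rows)."""
    rows = [resample_1d(row, scale) for row in img]
    cols = [resample_1d(list(col), scale) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

small = [[10, 20], [30, 40]]
factor = choose_scale(width=2, height=2, target_min_side=4)
large = resize_bicubic(small, factor)
```

In practice an image library's bicubic resize would replace `resize_bicubic`; the sketch only shows how a size-dependent factor feeds the interpolation.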

All connected components are extracted using an adaptive local binarization method. A coarse-to-fine scheme is then employed to select text candidates. In the coarse stage, components that are clearly non-text are discarded. In the fine stage, a Conditional Random Field (CRF) model labels the remaining components as text or non-text by integrating unary component properties with relationships between neighboring components. For the unary property, an SVM classifier estimates the probability that a component is a character from CC features such as stroke width, aspect ratio, and shape. Neighboring components with similar height and color are linked into chains for CRF analysis. After the components are labeled with the learned CRF, the chains of text CCs are grouped to generate text region candidates.
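The fine stage can be sketched as follows, under several stated assumptions. The `Component` fields, the similarity thresholds in `similar`, the greedy left-to-right linking in `link_chains`, and the fixed Potts-style `pairwise_weight` are all hypothetical; the actual system learns its CRF parameters and obtains `p_text` from an SVM. The sketch only shows how a unary text probability and a neighbor-agreement term combine on a chain, solved exactly with Viterbi-style dynamic programming.

```python
import math
from dataclasses import dataclass

@dataclass
class Component:
    x: float       # left edge of the bounding box
    height: float  # bounding-box height
    color: float   # mean gray level
    p_text: float  # unary text probability (from an SVM in the source)

def similar(a, b, height_ratio=1.5, color_tol=40.0):
    """Assumed similarity test on height and color."""
    ratio = max(a.height, b.height) / min(a.height, b.height)
    return ratio <= height_ratio and abs(a.color - b.color) <= color_tol

def link_chains(components):
    """Greedy linking: walk left to right, extend the current chain
    while consecutive components look similar."""
    chains = []
    for comp in sorted(components, key=lambda c: c.x):
        if chains and similar(chains[-1][-1], comp):
            chains[-1].append(comp)
        else:
            chains.append([comp])
    return chains

def label_chain(chain, pairwise_weight=1.0):
    """Minimum-energy labeling of a chain (0 = non-text, 1 = text).
    Energy = sum of -log unary probabilities + pairwise_weight for
    every pair of neighbors with different labels."""
    eps = 1e-6
    unary = [(-math.log(max(1 - c.p_text, eps)),
              -math.log(max(c.p_text, eps))) for c in chain]
    cost = list(unary[0])
    back = []
    for u in unary[1:]:
        new_cost, choices = [], []
        for label in (0, 1):
            cands = [cost[prev] + (pairwise_weight if prev != label else 0.0)
                     for prev in (0, 1)]
            best = 0 if cands[0] <= cands[1] else 1
            choices.append(best)
            new_cost.append(cands[best] + u[label])
        back.append(choices)
        cost = new_cost
    label = 0 if cost[0] <= cost[1] else 1
    labels = [label]
    for choices in reversed(back):  # trace the best path backwards
        label = choices[label]
        labels.append(label)
    return labels[::-1]

comps = [Component(0, 20, 100, 0.9), Component(30, 22, 110, 0.4),
         Component(60, 21, 105, 0.9), Component(200, 60, 100, 0.2)]
chains = link_chains(comps)
labels = [label_chain(chain) for chain in chains]
```

With the pairwise term enabled, the weak middle component of the first chain is pulled toward the label of its confident neighbors, which is the effect the neighbor relationship is meant to provide.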

Finally, the projection profile and the recognition result of each text region are analyzed to validate the text region hypotheses. In addition, the text regions are separated into words using heuristic rules.
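The heuristic rules for word separation are not specified in this description. As one plausible illustration only, the sketch below computes a vertical projection profile of a binarized text region and splits it wherever a run of empty columns reaches an assumed threshold; `split_words` and `min_gap` are hypothetical names and values.

```python
def column_profile(region):
    """Vertical projection profile: number of ink pixels per column.
    `region` is a binary image given as a list of rows of 0/1 values."""
    return [sum(col) for col in zip(*region)]

def split_words(region, min_gap=3):
    """Return inclusive (start, end) column ranges of words, splitting
    at runs of at least `min_gap` empty columns (assumed heuristic)."""
    words = []
    start = end = None
    gap = 0
    for x, count in enumerate(column_profile(region)):
        if count > 0:
            if start is None:
                start = x          # first ink column of a new word
            end = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:     # gap wide enough to end the word
                words.append((start, end))
                start = None
    if start is not None:          # close the last open word
        words.append((start, end))
    return words

region = [
    [1, 1, 0, 1, 1, 0, 0, 0, 1, 1],
    [1, 0, 0, 1, 1, 0, 0, 0, 0, 1],
]
words = split_words(region, min_gap=3)
```

A fixed `min_gap` is the simplest choice; a threshold relative to the region height or the median character gap would adapt better to varying font sizes.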

Department of Electronic Engineering, Tsinghua University, Beijing, China.
yangcheng@ocrserv.ee.tsinghua.edu.cn