Method: TH-TextLoc - Task 1 - Text Localization - Focused Scene Text

method: TH-TextLoc 2013-04-08

Authors: Cheng Yang, Changsong Liu, Xiaoqing Ding

Description: TH-TextLoc (method name)

Building on the system we presented at the ICDAR 2011 Robust Reading Competition, we have improved the algorithms in several ways. We adopt a multi-scale strategy to extract more text candidates, and we select candidates more accurately using Conditional Random Field (CRF) model analysis.

The proposed approach extracts text candidates by connected component (CC) analysis in a multi-scale manner. At each scale, candidate components are extracted from the gray-level image by adaptive local binarization with a given window size. A coarse-to-fine scheme is then used to select text candidates. In the coarse stage, obviously non-text components are discarded. In the fine stage, a Conditional Random Field (CRF) model labels the remaining components as text or non-text by integrating the unary properties of each component with the relationships between neighboring components. For the unary property, an SVM classifier estimates the probability that a component is a character from CC features such as stroke width, aspect ratio, and shape. Neighboring components with similar height and color are linked into chains for CRF analysis. After the components are labeled by the learned CRF, the chains of text CCs are grouped to generate text region candidates. The projection profile and the recognition result of each text region are then analyzed to validate the text region hypotheses.
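The chain-linking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Component` fields, the helper names, and all thresholds (`max_height_ratio`, `max_gray_diff`, `max_gap_factor`) are assumptions chosen for illustration, and mean gray level stands in for color. Components are linked when they have similar height, similar gray level, and are horizontally close, and a union-find structure collects linked components into chains.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Bounding box and mean gray level of one connected component (hypothetical layout)."""
    x: float     # left edge
    y: float     # top edge
    w: float     # width
    h: float     # height
    gray: float  # mean gray level, 0-255 (stand-in for color)


def similar(a, b, max_height_ratio=1.5, max_gray_diff=30.0, max_gap_factor=1.0):
    """Return True if two components look like neighboring characters.

    All thresholds are illustrative assumptions, not values from the method.
    """
    tall, short = max(a.h, b.h), min(a.h, b.h)
    if short <= 0 or tall / short > max_height_ratio:
        return False  # heights differ too much
    if abs(a.gray - b.gray) > max_gray_diff:
        return False  # gray levels differ too much
    # Horizontal gap between the two boxes (negative when they overlap).
    gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    if gap > max_gap_factor * tall:
        return False  # too far apart horizontally
    # Require that the boxes share at least half of the shorter height.
    overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return overlap > 0.5 * short


def link_chains(components, **thresholds):
    """Group component indices into chains using union-find."""
    n = len(components)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similar(components[i], components[j], **thresholds):
                parent[find(i)] = find(j)

    chains = {}
    for i in range(n):
        chains.setdefault(find(i), []).append(i)
    return sorted(chains.values())
```

In this sketch each returned chain is a list of component indices. In a full system the chains would define the neighborhood structure over which the CRF is evaluated; here they are simply returned.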

Finally, the text regions from the different scales are merged, and the merged regions are separated into words using heuristic rules. The proposed method achieved good performance on both scene and born-digital images.
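The description does not specify the heuristic rules used for word separation. As one plausible example, and purely as an assumption, the sketch below splits a text line wherever the horizontal gap between consecutive characters is much larger than the median gap in that line. The function name `split_into_words` and the parameters `gap_factor` and `min_gap` are hypothetical.

```python
from statistics import median


def split_into_words(char_spans, gap_factor=2.0, min_gap=1.0):
    """Split one text line into words by inter-character gaps.

    char_spans: list of (x_left, x_right) extents of the characters in a line.
    A new word starts where the gap exceeds gap_factor times the median gap
    (the median is floored at min_gap so touching characters do not make the
    threshold collapse to zero). This is an illustrative heuristic only.
    """
    if not char_spans:
        return []
    spans = sorted(char_spans)
    # Gap between each character and the next one; overlaps count as zero.
    gaps = [max(0.0, nxt[0] - cur[1]) for cur, nxt in zip(spans, spans[1:])]
    if not gaps:
        return [spans]
    threshold = gap_factor * max(median(gaps), min_gap)
    words = [[spans[0]]]
    for gap, span in zip(gaps, spans[1:]):
        if gap > threshold:
            words.append([span])    # large gap: start a new word
        else:
            words[-1].append(span)  # small gap: same word
    return words
```

For a line with character extents `[(0, 10), (12, 22), (24, 34), (50, 60), (62, 72)]`, the gaps are 2, 2, 16, and 2, so the sketch splits once at the 16-pixel gap and returns two words.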

Authors: Cheng Yang, Changsong Liu, Xiaoqing Ding.
Department of Electronic Engineering, Tsinghua University, Beijing, China.
yangcheng@ocrserv.ee.tsinghua.edu.cn