method: MCEM v2 2019-04-30


Description: We propose a deep-learning-based approach to scene text recognition. Text lines are fed to a recognition network with a standard attention-based encoder-decoder architecture. Our training set is the union of the CTW, RCTW, RECTS, ART, and LSVT datasets, all of which have been publicly released. These datasets are also augmented to train a model that can handle text under perspective distortion. Our experiments show that the two models complement each other well.
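The description does not say how the perspective augmentation is implemented. As an illustration only, one common way to simulate perspective is to jitter the four corners of a text-line image and fit the homography that maps the original corners to the jittered ones. The sketch below assumes this approach; the function names, the `max_shift` parameter, and the 256x32 image size are hypothetical and are not taken from the submission.

```python
import random


def jitter_corners(width, height, max_shift, rng):
    """Return the four image corners and a randomly shifted copy of them.

    max_shift is a fraction of the image size (hypothetical parameter).
    """
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    shifted = [
        (x + rng.uniform(-max_shift, max_shift) * width,
         y + rng.uniform(-max_shift, max_shift) * height)
        for x, y in corners
    ]
    return corners, shifted


def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_points(src, dst):
    """Fit the 3x3 homography H (with H[2][2] = 1) mapping 4 src points to dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.extend([u, v])
    h = solve_linear(rows, rhs) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def warp_point(H, x, y):
    """Apply homography H to a single point (x, y)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


# Example: a 256x32 text line with corners moved by up to 10% of its size.
rng = random.Random(0)
src, dst = jitter_corners(256.0, 32.0, 0.1, rng)
H = homography_from_points(src, dst)
```

In a training pipeline, `H` would typically be handed to an image-warping routine to resample each text-line image; that step is omitted here because it depends on the image library in use.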
name organization
Xiangxiang Wang(王翔翔) iFLYTEK(科大讯飞)
Shuai Shao(邵帅) iFLYTEK(科大讯飞)
Hao Wu(吴浩) iFLYTEK(科大讯飞)
Chenyu Liu(刘辰宇) iFLYTEK(科大讯飞)
Yixing Zhu(朱意星) USTC(中国科技大学)
Zhengyan Yang(杨争艳) iFLYTEK(科大讯飞)
Changjie Wu(吴昌杰) USTC(中国科技大学)
Mobai Xue(薛莫白) USTC(中国科技大学)
Jiajia Wu(吴嘉嘉) iFLYTEK(科大讯飞)
Bing Yin(殷兵) iFLYTEK(科大讯飞)
Cong Liu(刘聪) iFLYTEK(科大讯飞)
Jinshui Hu(胡金水) iFLYTEK(科大讯飞)
Jun Du(杜俊) USTC(中国科技大学)
Jianshu Zhang(张建树) USTC(中国科技大学)
Lirong Dai(戴礼荣) USTC(中国科技大学)