Authors: Xu Liu, Tao Wei
Description: Our method is based on 2D attention: we use ResNet as the backbone and apply a tailored 2D-attention module on top of it. The result is produced by a single model, without ensemble tricks.
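The idea of attending over a 2D feature map can be sketched as follows. This is a minimal illustration, not the authors' actual module: layer names, dimensions, and the additive scoring function are all assumptions, and the backbone feature map is simulated with random data.

```python
import torch
import torch.nn as nn

class TwoDAttention(nn.Module):
    """Minimal 2D attention: for each decoding step, attend over all
    H*W positions of a CNN feature map (illustrative sketch only)."""
    def __init__(self, feat_dim, query_dim, hidden_dim=128):
        super().__init__()
        self.proj_feat = nn.Linear(feat_dim, hidden_dim)
        self.proj_query = nn.Linear(query_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, feats, query):
        # feats: (B, C, H, W) backbone output; query: (B, Q) decoder state
        b, c, h, w = feats.shape
        flat = feats.view(b, c, h * w).transpose(1, 2)      # (B, H*W, C)
        e = self.score(torch.tanh(
            self.proj_feat(flat) + self.proj_query(query).unsqueeze(1)))
        alpha = torch.softmax(e, dim=1)                     # (B, H*W, 1)
        context = (alpha * flat).sum(dim=1)                 # (B, C)
        return context, alpha.view(b, h, w)

feats = torch.randn(2, 512, 8, 25)   # stand-in for a ResNet feature map
query = torch.randn(2, 256)          # stand-in for a decoder hidden state
ctx, attn = TwoDAttention(512, 256)(feats, query)
print(ctx.shape, attn.shape)  # torch.Size([2, 512]) torch.Size([2, 8, 25])
```

Keeping the spatial layout 2D (rather than collapsing the height axis first) lets the attention map follow curved or rotated text across the image.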
Authors: Hancom Vision team
Description: Our model combines a CNN backbone, a BiLSTM, and an attention mechanism.
Trained on MJSynthText + SynthText + external data (pretraining), plus Focused Scene Text 2013-2015 and Incidental Scene Text 2015.
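The CNN + BiLSTM + attention pipeline might look roughly like this. It is a toy sketch under assumed layer sizes: the conv stack, hidden widths, and the simple attention pooling (rather than a step-by-step character decoder) are illustrative choices, not the team's configuration.

```python
import torch
import torch.nn as nn

class CNNBiLSTMAttn(nn.Module):
    """Sketch: CNN features -> BiLSTM sequence encoding -> attention
    pooling -> classifier. Real recognizers decode per character; this
    toy version produces one prediction per image."""
    def __init__(self, n_classes=37, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(               # toy conv stack
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)))    # collapse height to 1
        self.bilstm = nn.LSTM(128, hidden, bidirectional=True,
                              batch_first=True)
        self.attn = nn.Linear(2 * hidden, 1)    # per-timestep scores
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                       # x: (B, 1, H, W)
        f = self.cnn(x)                         # (B, 128, 1, W')
        seq = f.squeeze(2).transpose(1, 2)      # (B, W', 128)
        h, _ = self.bilstm(seq)                 # (B, W', 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights
        pooled = (w * h).sum(dim=1)             # (B, 2*hidden)
        return self.fc(pooled)

logits = CNNBiLSTMAttn()(torch.randn(2, 1, 32, 100))
print(logits.shape)  # torch.Size([2, 37])
```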
Authors: Jianzhong Xu, Miao Wang, Lulu Xu, Long Ma, Xuefeng Su
Description: Our method is based on an encoder-decoder framework. We use SE-ResNet as the backbone and a 2-layer bidirectional RNN with residual connections for decoding.
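The two distinctive pieces here, the SE (Squeeze-and-Excitation) block that turns ResNet into SE-ResNet and a 2-layer bidirectional RNN with a residual connection, can be sketched as below. Layer sizes and the choice of GRU cells are assumptions for illustration, not the authors' actual settings.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: globally pool each channel, then learn
    per-channel weights to rescale the feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                       # x: (B, C, H, W)
        s = x.mean(dim=(2, 3))                  # squeeze: global avg pool
        w = self.fc(s).unsqueeze(-1).unsqueeze(-1)
        return x * w                            # excite: rescale channels

class ResidualBiRNN(nn.Module):
    """2-layer bidirectional GRU with a residual connection between
    the layers (illustrative stand-in for the residual BiRNN decoder)."""
    def __init__(self, dim=256):
        super().__init__()
        self.rnn1 = nn.GRU(dim, dim // 2, bidirectional=True,
                           batch_first=True)
        self.rnn2 = nn.GRU(dim, dim // 2, bidirectional=True,
                           batch_first=True)

    def forward(self, seq):                     # seq: (B, T, dim)
        h1, _ = self.rnn1(seq)
        h2, _ = self.rnn2(h1)
        return h1 + h2                          # residual connection

x = torch.randn(2, 64, 14, 14)                  # toy feature map
seq = torch.randn(2, 25, 256)                   # toy encoded sequence
print(SEBlock(64)(x).shape, ResidualBiRNN()(seq).shape)
```

The residual connection lets the second RNN layer learn only a correction on top of the first layer's output, which tends to ease optimization of stacked recurrent layers.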