Authors: Quan Chen, Tiezheng Ge, Zhiqiang Zhang, Minghui Li, Kun Gai
Description: We approach this task with a combination of three deep neural networks and a language model. Specifically, an LSTM model performs word recognition on the features generated by a CNN model. The final words are decoded by a bi-gram language model, and their locations are refined by a location regression network. Two internal text corpora are used in the training procedure. For the "strongly" and "weakly" versions, the corresponding given vocabulary is simply used as a filter on the final output.
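
As a rough illustration of the decoding and filtering stages described above, the following Python sketch runs a Viterbi search that combines per-position character scores with a character bi-gram model, then filters the result against a vocabulary. It is not the submitted system: the function names (decode_bigram, filter_by_vocabulary), the lm_weight parameter, the FLOOR penalty, the "^" start symbol, the choice of a character-level (rather than word-level) bi-gram, the one-score-dictionary-per-character input format, and the toy probabilities are all assumptions made for the example. Treating the vocabulary filter as "reject words not in the list" is likewise one possible reading of the description.

```python
import math

FLOOR = math.log(1e-6)  # assumed penalty for bi-grams missing from the model


def decode_bigram(emissions, bigram_logp, lm_weight=1.0):
    """Viterbi search over character sequences.

    emissions   : list of dicts, one per character position, char -> log-prob
                  (a stand-in for the recognizer's per-step output)
    bigram_logp : dict (prev_char, char) -> log-prob; "^" marks the word start
    lm_weight   : hypothetical knob scaling the language-model term
    """
    # best[c] holds (score, string) for the best path ending in character c.
    best = {"^": (0.0, "")}
    for step in emissions:
        nxt = {}
        for c, emit in step.items():
            candidates = [
                (score + lm_weight * bigram_logp.get((p, c), FLOOR) + emit, s + c)
                for p, (score, s) in best.items()
            ]
            nxt[c] = max(candidates)
        best = nxt
    return max(best.values())[1]


def filter_by_vocabulary(word, vocabulary):
    """Return the word if it is in the vocabulary (case-insensitive), else None."""
    allowed = {v.upper() for v in vocabulary}
    return word if word.upper() in allowed else None


# Toy inputs (made up): the recognizer slightly prefers "o" at position 2,
# but the bi-gram model makes "ca" and "at" more likely than "co" and "ot".
emissions = [
    {"c": math.log(0.9), "e": math.log(0.1)},
    {"a": math.log(0.45), "o": math.log(0.55)},
    {"t": math.log(1.0)},
]
bigram_logp = {
    ("^", "c"): math.log(0.5), ("^", "e"): math.log(0.5),
    ("c", "a"): math.log(0.6), ("c", "o"): math.log(0.2),
    ("e", "a"): math.log(0.3),
    ("a", "t"): math.log(0.5), ("o", "t"): math.log(0.1),
}

word = decode_bigram(emissions, bigram_logp)
print(word, filter_by_vocabulary(word, ["CAT", "DOG"]))
```

With these toy numbers the language model overrides the recognizer's locally preferred "o", so the decoded word is "cat"; setting lm_weight to 0 falls back to the per-position best characters, which spell "cot" and are rejected by the vocabulary filter.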