Authors: Ning Lu*, Wenwen Yu*, Xianbiao Qi*, Yihao Chen, Rong Xiao
Affiliation: Ping An Property & Casualty Insurance Co
Description: We propose MASTER (Multi-Aspect Non-local Network for Scene Text Recognition), a self-attention-based scene text recognizer. It consists of two key modules: a Multi-Aspect Global Context Attention (GCAttention) based encoder and a Transformer-based decoder. MASTER has three advantages: (1) The model learns both input-output attention and self-attention, which encodes feature-feature and target-target relationships inside the encoder and decoder. (2) Experiments demonstrate that the method is more robust to spatial distortion. (3) The training process is highly parallel and efficient. Experiments on standard benchmarks demonstrate that it achieves state-of-the-art performance in both efficiency and recognition accuracy.
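
The "multi-aspect" global context idea in the description can be illustrated with a minimal sketch, assuming that each aspect attends over all spatial positions using its own slice of the feature channels. The function name `multi_aspect_context`, the `key_weights` parameter, and the scaling by the square root of the slice width are illustrative assumptions, not the authors' implementation; the real encoder also uses learned layers that are omitted here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_aspect_context(features, key_weights, num_aspects):
    """Pool N position vectors into one context vector, aspect by aspect.

    features:    list of N position vectors, each of length d
    key_weights: list of d floats (hypothetical per-channel scoring weights)
    num_aspects: number of channel slices; must divide d evenly

    Returns (context, attention): a length-d context vector and, for each
    aspect, the attention weights over the N positions.
    """
    d = len(features[0])
    if d % num_aspects != 0:
        raise ValueError("channel count must be divisible by num_aspects")
    width = d // num_aspects

    context, attention = [], []
    for aspect in range(num_aspects):
        lo, hi = aspect * width, (aspect + 1) * width
        # One score per position, computed from this aspect's channels only.
        scores = [
            sum(f[c] * key_weights[c] for c in range(lo, hi)) / math.sqrt(width)
            for f in features
        ]
        weights = softmax(scores)
        attention.append(weights)
        # Attention-weighted sum of this aspect's channels over all positions.
        for c in range(lo, hi):
            context.append(sum(w * f[c] for w, f in zip(weights, features)))
    return context, attention


# Two positions, four channels, two aspects. Zero scoring weights give
# uniform attention, so the context is the plain mean of the positions.
features = [[1.0, 0.0, 2.0, 4.0], [3.0, 2.0, 0.0, 0.0]]
context, attention = multi_aspect_context(features, [0.0] * 4, num_aspects=2)
```

Because every aspect computes its own distribution over positions, different channel groups can focus on different regions of the image, which is the intuition behind using several aspects rather than a single global pooling.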