method: ReCTS_HWY 2019-04-29

Authors: HuWenyang

Description: A single model based on 2D attention. A ResNet50 backbone extracts features, and an attention-based decoder predicts the outputs. The attention is learned over the ResNet50 feature map. No public or private datasets are used. (Nanyang Technological University student, huwe0013@e.ntu.edu.sg)
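
The description does not say how the attention is computed, so the following is only a minimal sketch of one decoding step of 2D attention over a backbone feature map. It assumes additive (tanh) scoring and uses NumPy. The function name attend_2d, the parameters w_feat, w_hidden and v, and all shapes are hypothetical and are not taken from the submission.

```python
import numpy as np


def attend_2d(feature_map, hidden, w_feat, w_hidden, v):
    """One additive-attention step over a 2D feature map (illustrative only).

    feature_map: (H, W, C) array, e.g. the output of a CNN backbone
    hidden:      (D,) decoder state at the current step
    w_feat:      (C, A) projection of the features   (hypothetical parameter)
    w_hidden:    (D, A) projection of the state      (hypothetical parameter)
    v:           (A,) scoring vector                 (hypothetical parameter)
    Returns the context vector (C,) and the attention weights (H, W).
    """
    h, w, c = feature_map.shape
    flat = feature_map.reshape(h * w, c)  # one row per spatial position

    # Score every spatial position against the current decoder state.
    scores = np.tanh(flat @ w_feat + hidden @ w_hidden) @ v  # (H*W,)

    # Numerically stable softmax over all H*W positions.
    scores = scores - scores.max()
    weights = np.exp(scores)
    weights = weights / weights.sum()

    # The context vector is the attention-weighted sum of the features.
    context = weights @ flat  # (C,)
    return context, weights.reshape(h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.standard_normal((4, 8, 16))  # H=4, W=8, C=16
    state = rng.standard_normal(32)         # D=32
    ctx, attn = attend_2d(
        fmap,
        state,
        rng.standard_normal((16, 24)),      # A=24
        rng.standard_normal((32, 24)),
        rng.standard_normal(24),
    )
    print(ctx.shape, attn.shape)
```

In a full decoder, the context vector would be combined with the decoder state to predict the next character, and the step would repeat until an end token is produced. That loop is omitted here because the description gives no details about it.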