Method: Ant-FinCV (2023-03-16)

Authors: Tao Huang, Jie Wang, Tao Xu

Affiliation: Ant Group

Description: An end-to-end, OCR-free transformer for document understanding. The encoder maps a document image into embeddings, and the decoder generates a sequence of tokens conditioned on those embeddings. The tokens form a structured string that can be converted into key-value entity links. The model is trained on 90% of the training data for 300 epochs, and multi-line keys and values are split. The final result is corrected using the OCR output.
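
The conversion and correction steps described above could look roughly like the following sketch. It is illustrative only: the `<k>`/`<v>` tag format, the function names, and the similarity cutoff are assumptions, since the description does not specify the actual token scheme or how the OCR output is used for correction. Here, fuzzy string matching from Python's standard `difflib` module stands in for whatever correction rule the authors applied.

```python
import re
from difflib import get_close_matches

# Hypothetical tag scheme: the real structured-string format is not given in the source.
PAIR_RE = re.compile(r"<k>(.*?)</k>\s*<v>(.*?)</v>", re.DOTALL)


def parse_kv(sequence):
    """Convert a decoded structured string into a list of (key, value) links."""
    return [(k.strip(), v.strip()) for k, v in PAIR_RE.findall(sequence)]


def snap_to_ocr(text, ocr_lines, cutoff=0.8):
    """Return the most similar OCR line, or the text unchanged if none is close enough.

    The cutoff value is an assumption chosen for illustration.
    """
    matches = get_close_matches(text, ocr_lines, n=1, cutoff=cutoff)
    return matches[0] if matches else text


def correct_pairs(pairs, ocr_lines, cutoff=0.8):
    """Apply OCR-based correction to both sides of every key-value link."""
    return [
        (snap_to_ocr(k, ocr_lines, cutoff), snap_to_ocr(v, ocr_lines, cutoff))
        for k, v in pairs
    ]


# Example: the decoder misreads the digit 0 as the letter O in the value.
decoded = "<k>Invoice No</k><v>INV-1O023</v><k>Date</k><v>2023-03-16</v>"
ocr_lines = ["Invoice No", "INV-10023", "Date", "2023-03-16"]
print(correct_pairs(parse_kv(decoded), ocr_lines))
```

Snapping each predicted string to its nearest OCR line keeps the model's linking decisions while borrowing the OCR engine's character-level accuracy. A cutoff that is too low would risk replacing a correct prediction with an unrelated line.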