Authors: Yue Li, Guangwei Huang, TingTing Wang, BOE_IOT_AIBD
Our approach is divided into three parts: text extraction, text classification and text archiving.
Text extraction builds on two detectors: YOLOv3 with a Darknet-53 backbone and the Connectionist Text Proposal Network (CTPN). In addition, multi-scale training, training data augmentation, and barcode detection are used.
Text classification uses BERT. We labeled part of the training set and used it to fine-tune the BERT network.
Text archiving sorts the recognized text into the output fields. Address lines are merged, and the date and amount are extracted from the text using regular expressions.
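
The archiving step could be sketched roughly as follows. This is only an illustration, not the authors' implementation: the patterns, the "total" keyword heuristic, and the function names (extract_date, extract_total, merge_address) are all assumptions, since the description does not say which date or amount formats are handled.

```python
import re

# Hypothetical patterns: the formats handled by the original system are not stated.
# Dates such as 25/12/2018, 25-12-18, or 2018-12-25.
DATE_RE = re.compile(
    r"\b(?:\d{1,2}[/-]\d{1,2}[/-]\d{2,4}|\d{4}[/-]\d{1,2}[/-]\d{1,2})\b"
)
# Amounts with two decimals and optional thousands separators, e.g. 1,234.50.
AMOUNT_RE = re.compile(r"\d+(?:,\d{3})*\.\d{2}(?!\d)")


def extract_date(lines):
    """Return the first date-like string found in the recognized lines, or None."""
    for line in lines:
        match = DATE_RE.search(line)
        if match:
            return match.group(0)
    return None


def extract_total(lines):
    """Return the largest amount on lines that mention 'total', or None.

    Taking the maximum is an assumed heuristic for skipping subtotals.
    """
    candidates = []
    for line in lines:
        if "total" in line.lower():
            for text in AMOUNT_RE.findall(line):
                candidates.append(float(text.replace(",", "")))
    return max(candidates) if candidates else None


def merge_address(lines):
    """Join the lines classified as address into a single string."""
    return " ".join(part.strip() for part in lines if part.strip())
```

For example, given the recognized lines "Date: 25/12/2018 10:30", "SUBTOTAL 10.00", and "TOTAL 1,234.50", extract_date returns "25/12/2018" and extract_total returns 1234.5. In practice the lines passed to merge_address would be those the classifier labeled as address.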