method: BERT Large Ensemble
Date: 2023-03-18
Authors: ZhuangZhuang Cai
Affiliation: GammaLab, Pingan
Email: caizhuang588@pingan.com.cn
Description: News video question-answering method based on OCR, layout analysis, object tracking, and ASR technologies. The method uses OCR to recognize text in video frames, layout analysis to merge paragraphs, object tracking to remove text duplicated across video frames, and ASR to transcribe the speech in video clips. The de-duplicated OCR text and the ASR text are concatenated to form the context for an extractive question-answering task, on which we then fine-tuned the pre-trained model. Our method achieved competitive results in the ICDAR2023 NewsVideoQA competition, demonstrating the effectiveness of using OCR and ASR technologies for news video question-answering.
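The context-building step described above can be sketched as follows. This is an illustrative sketch only, not the authors' code: the submission de-duplicates text with object tracking, whereas this example swaps in plain string similarity from Python's `difflib`, and the names `dedupe_ocr_lines`, `build_context`, and the `threshold` value are hypothetical.

```python
from difflib import SequenceMatcher


def dedupe_ocr_lines(frame_texts, threshold=0.9):
    """Keep one copy of each OCR line that repeats across frames.

    Assumption: string similarity stands in for the tracking-based
    de-duplication described in the submission. Two lines count as
    duplicates when their similarity ratio is >= threshold.
    """
    kept = []
    for lines in frame_texts:
        for line in lines:
            text = " ".join(line.split())  # normalize whitespace
            if not text:
                continue
            is_dup = any(
                SequenceMatcher(None, text.lower(), k.lower()).ratio() >= threshold
                for k in kept
            )
            if not is_dup:
                kept.append(text)
    return kept


def build_context(frame_texts, asr_text, sep=" "):
    """Concatenate de-duplicated OCR text with the ASR transcript."""
    ocr_text = sep.join(dedupe_ocr_lines(frame_texts))
    return sep.join(part for part in (ocr_text, asr_text.strip()) if part)


frames = [
    ["BREAKING NEWS", "Storm hits coast"],
    ["BREAKING NEWS", "Storm hits coast."],  # near-duplicate of the frame above
    ["Mayor speaks"],
]
context = build_context(frames, "the mayor spoke to reporters today")
print(context)
```

The resulting string would then serve as the passage from which an extractive question-answering model selects an answer span.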
method: bert-squad2-single
Date: 2023-03-18
Authors: Daquan
Affiliation: None
Email: lindq@shanghaitech.edu.cn
Description: This paper presents an OCR and ASR-based approach for the news video question-answering task. Our approach leverages OCR technology to recognize text in video frames and ASR technology to transcribe the speech in video clips. We then concatenate the OCR and ASR text to form the context for the extractive question-answering task. Our approach achieved competitive results in the ICDAR2023 NewsVideoQA competition, demonstrating the effectiveness of using OCR and ASR technology for news video question-answering.
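For readers unfamiliar with extractive question answering, the answer is typically a span of the context chosen from per-token start and end scores. The sketch below shows that span selection under stated assumptions: the scores are made-up numbers rather than output from the submitted model, and `best_span`, `extract_answer`, and `max_answer_len` are hypothetical names, not part of any submission.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return (start, end) token indices maximizing start + end score.

    Only spans with start <= end and at most max_answer_len tokens
    are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        last = min(s + max_answer_len, len(end_scores))
        for e in range(s, last):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


def extract_answer(tokens, start_scores, end_scores, max_answer_len=30):
    """Join the tokens of the highest-scoring span into an answer string."""
    s, e = best_span(start_scores, end_scores, max_answer_len)
    return " ".join(tokens[s:e + 1])


# Toy context tokens (OCR + ASR text) with illustrative scores.
tokens = ["storm", "hits", "coast", "mayor", "speaks"]
start_scores = [0.1, 0.2, 0.1, 3.0, 0.5]
end_scores = [0.0, 0.1, 2.5, 0.3, 2.0]
print(extract_answer(tokens, start_scores, end_scores))
```

In a real system the scores would come from a fine-tuned model's start and end logits over the tokenized question and context.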
method: newsvqa model.
Date: 2023-03-21
Authors: Rakshitha R T, Bhoomika Kumta, Soumya Jituri
Affiliation: KLE Technological University
Email: 01fe20bcs107@kletech.ac.in
Description: NewsVideoQA ["Watching the News: Towards VideoQA Models that can Read"] is a text-based VideoQA dataset consisting of videos that contain scene text, which is necessary to answer a given question. We fine-tuned the pretrained GIT model ["GIT: Generative Image-to-text Transformer for Vision and Language"], which was pretrained on 0.8 billion image-text pairs, on this dataset. GIT is state of the art on a few VideoQA datasets, such as MSRVTT-QA.
Date | Method | ANLS | ACC
---|---|---|---
2023-03-18 | BERT Large Ensemble | 0.7226 | 0.6251
2023-03-18 | bert-squad2-single | 0.7035 | 0.6072
2023-03-21 | newsvqa model. | 0.3234 | 0.2724
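
The ANLS column reports Average Normalized Levenshtein Similarity. A minimal sketch of how such a score is commonly computed is given below: each prediction is compared with every accepted answer, the normalized edit distance is turned into a similarity, similarities at or beyond the 0.5 distance threshold are zeroed, and the best per-question score is averaged. The lowercasing and whitespace normalization are assumptions, and the competition's official evaluation script may differ in such details.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def anls(predictions, ground_truths, tau=0.5):
    """Average, over questions, of the best thresholded similarity.

    predictions: list of predicted answer strings.
    ground_truths: list of lists of accepted answer strings.
    """
    total = 0.0
    for pred, answers in zip(predictions, ground_truths):
        p = " ".join(pred.lower().split())  # assumed normalization
        best = 0.0
        for ans in answers:
            a = " ".join(ans.lower().split())
            denom = max(len(p), len(a))
            nl = levenshtein(p, a) / denom if denom else 0.0
            score = 1.0 - nl if nl < tau else 0.0
            best = max(best, score)
        total += best
    return total / len(predictions) if predictions else 0.0


preds = ["BBC News", "bbc new", "cnn"]
golds = [["bbc news"], ["bbc news"], ["bbc news"]]
print(anls(preds, golds))
```

Because near misses earn partial credit, ANLS is more forgiving of small OCR or transcription errors than an exact-match metric.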