Method: RALLM

Date: 2023-07-05

Authors: xyy

Description: RALLM

Method: Applica.ai TILT

Date: 2021-04-11

Authors: Applica.ai Research Team

Affiliation: Applica.ai

Email: rafal.powalski@applica.ai, dawid.jurkiewicz@applica.ai

Description: TILT is a neural network architecture that simultaneously learns layout information, visual features, and textual semantics. Contrary to previous approaches, we rely on an encoder-decoder architecture. Results were obtained from a single TILT-Large model, pre-trained as described in the paper. The model was fine-tuned on the challenge training set.

Ranking Table

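Column groups: Answer type (Image span, Question span, Multiple spans, Non span); Evidence (Table/List, Textual, Visual object, Figure, Map); Operation (Comparison, Arithmetic, Counting).

| Date | Method | Score | Image span | Question span | Multiple spans | Non span | Table/List | Textual | Visual object | Figure | Map | Comparison | Arithmetic | Counting |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2022-03-02 | Human Performance | 0.9718 | 0.9745 | 0.9777 | 0.9335 | 0.9716 | 0.9780 | 0.9789 | 0.9770 | 0.9699 | 0.9433 | 0.9712 | 0.9837 | 0.9544 |
| 2023-07-05 | RALLM | 0.7175 | 0.7421 | 0.7884 | 0.0830 | 0.8031 | 0.6866 | 0.7088 | 0.7376 | 0.7214 | 0.8049 | 0.7141 | 0.8038 | 0.7916 |
| 2021-04-11 | Applica.ai TILT | 0.6120 | 0.6765 | 0.6419 | 0.4391 | 0.3832 | 0.5917 | 0.7916 | 0.4545 | 0.5654 | 0.4480 | 0.4801 | 0.4958 | 0.2652 |
| 2023-08-20 | PaLI-X (Google Research, Single Generative Model) | 0.5477 | 0.5940 | 0.6950 | 0.4122 | 0.3534 | 0.5145 | 0.6891 | 0.6373 | 0.5040 | 0.4013 | 0.4290 | 0.4053 | 0.3091 |
| 2023-10-09 | nnrc_udop_224 | 0.4299 | 0.4716 | 0.5279 | 0.2410 | 0.2785 | 0.3740 | 0.5755 | 0.3475 | 0.3944 | 0.3347 | 0.2997 | 0.3583 | 0.1866 |
| 2022-09-18 | pix2struct-large | 0.4001 | 0.4308 | 0.4839 | 0.2059 | 0.3173 | 0.3833 | 0.5256 | 0.2572 | 0.3726 | 0.3283 | 0.2762 | 0.4198 | 0.2017 |
| 2021-04-09 | IG-BERT (single model) | 0.3854 | 0.4181 | 0.4481 | 0.2197 | 0.2849 | 0.3373 | 0.5016 | 0.3013 | 0.3706 | 0.3347 | 0.2939 | 0.3564 | 0.2000 |
| 2022-09-18 | pix2struct-base | 0.3820 | 0.4145 | 0.4381 | 0.1655 | 0.3014 | 0.3351 | 0.4971 | 0.2380 | 0.3632 | 0.3257 | 0.2344 | 0.4036 | 0.1888 |
| 2021-04-11 | NAVER CLOVA | 0.3219 | 0.3997 | 0.2317 | 0.1064 | 0.1068 | 0.2653 | 0.4488 | 0.1878 | 0.3095 | 0.3231 | 0.2020 | 0.1480 | 0.0695 |
| 2021-04-10 | Ensemble LM and VLM | 0.2853 | 0.3337 | 0.4181 | 0.0748 | 0.1169 | 0.2439 | 0.3649 | 0.2331 | 0.2645 | 0.2845 | 0.2580 | 0.1628 | 0.0647 |
| 2021-11-09 | LayoutLMv2 LARGE | 0.2829 | 0.3430 | 0.2763 | 0.0641 | 0.1114 | 0.2449 | 0.3855 | 0.1440 | 0.2601 | 0.3110 | 0.1897 | 0.1130 | 0.1158 |
| 2022-09-20 | BROS_BASE (WebViCoB 1M) | 0.2809 | 0.3436 | 0.2485 | 0.0277 | 0.1303 | 0.2545 | 0.3620 | 0.1318 | 0.2767 | 0.2886 | 0.2207 | 0.1745 | 0.0854 |
| 2022-03-03 | InfographicVQA paper model | 0.2720 | 0.3278 | 0.2386 | 0.0450 | 0.1371 | 0.2400 | 0.3626 | 0.1705 | 0.2551 | 0.2205 | 0.1836 | 0.1559 | 0.1140 |
| 2021-04-05 | BERT fuzzy search | 0.2078 | 0.2625 | 0.2333 | 0.0739 | 0.0259 | 0.1852 | 0.2995 | 0.0896 | 0.1942 | 0.1709 | 0.1805 | 0.0160 | 0.0436 |
| 2021-04-10 | BERT | 0.1678 | 0.2149 | 0.2117 | 0.0126 | 0.0152 | 0.1479 | 0.2450 | 0.1054 | 0.1505 | 0.1768 | 0.1578 | 0.0158 | 0.0185 |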

Ranking Graphic