Authors: Homa Foroughi, Chang Liu, Tharathorn Joy Rimchala, Terrence J. Torres
Description: We use a multi-step approach for Task2:
- [Optional] We pre-process the image by blurring it and then applying some morphological operations.
- We pass the images to an Optical Character Recognition (OCR) engine, which returns the text and coordinates of each detected text region.
- The OCR does not necessarily return boxes in order, so we sort them by their y coordinate.
- Then, we have 3 main steps:
o We merge the small boxes initially found at the character [or sub-word] level into larger boxes, based on their size and location, followed by box sorting.
o We find overlapping boxes, remove the smaller and intersecting ones, and form the largest proper box on each line, followed by box sorting.
o The steps above give us line-level aggregation; we then perform column splitting and sub-line aggregation to get word-level boxes.
- We detect the largest box around the image (in many cases one is detected by the OCR) and remove it.
- We remove very small boxes (which could be false positives).
For each box in the final set, we use its corresponding OCR text.
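
The box handling described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Box` type, all function names, and every threshold (`min_overlap`, `max_gap`, `frame_ratio`, `min_area`) are hypothetical choices made for illustration. The sketch also simplifies the pipeline: it drops the image-sized box and the tiny boxes first, and it uses a single horizontal gap threshold as a stand-in for the column-splitting and sub-line aggregation steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Axis-aligned OCR box (hypothetical type): corners plus recognized text."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str = ""

    @property
    def area(self) -> float:
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)


def drop_frame_and_noise(boxes: List[Box], image_area: float,
                         frame_ratio: float = 0.9,
                         min_area: float = 20.0) -> List[Box]:
    """Remove boxes covering most of the image and very small boxes.

    Both thresholds are assumed values, not taken from the description.
    """
    return [b for b in boxes if min_area <= b.area < frame_ratio * image_area]


def same_line(a: Box, b: Box, min_overlap: float = 0.5) -> bool:
    """True if the vertical overlap is a large share of the shorter box."""
    overlap = min(a.y1, b.y1) - max(a.y0, b.y0)
    shorter = min(a.y1 - a.y0, b.y1 - b.y0)
    return shorter > 0 and overlap / shorter >= min_overlap


def group_into_lines(boxes: List[Box]) -> List[List[Box]]:
    """Sort by y coordinate, then chain vertically overlapping boxes into lines."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.y0):
        if lines and same_line(lines[-1][-1], box):
            lines[-1].append(box)
        else:
            lines.append([box])
    return lines


def merge_adjacent(line: List[Box], max_gap: float) -> List[Box]:
    """Merge left-to-right neighbours whose horizontal gap is at most max_gap."""
    merged: List[Box] = []
    for box in sorted(line, key=lambda b: b.x0):
        if merged and box.x0 - merged[-1].x1 <= max_gap:
            prev = merged[-1]
            # Texts are concatenated without a separator (character-level input).
            merged[-1] = Box(prev.x0, min(prev.y0, box.y0),
                             max(prev.x1, box.x1), max(prev.y1, box.y1),
                             prev.text + box.text)
        else:
            merged.append(box)
    return merged


def aggregate(boxes: List[Box], image_area: float,
              max_gap: float = 5.0) -> List[Box]:
    """Filter, group into lines, then merge close neighbours within each line."""
    kept = drop_frame_and_noise(boxes, image_area)
    result: List[Box] = []
    for line in group_into_lines(kept):
        result.extend(merge_adjacent(line, max_gap))
    return result


# Made-up detections on a 200 x 100 image, deliberately out of order.
detections = [
    Box(30, 12, 40, 30, "x"),
    Box(10, 10, 20, 30, "T"),
    Box(20, 11, 30, 30, "a"),
    Box(0, 0, 200, 100, ""),      # box around the whole image
    Box(10, 50, 20, 70, "12"),
    Box(100, 52, 110, 70, "50"),  # far to the right: stays a separate box
    Box(150, 80, 152, 82, "."),   # tiny speck
]
words = aggregate(detections, image_area=200 * 100)
print([b.text for b in words])  # ['Tax', '12', '50']
```

In this example the three character boxes on the first line are merged into one box, the two boxes on the second line stay separate because the gap between them exceeds `max_gap`, and the image-sized box and the speck are discarded. Suitable threshold values depend on the image resolution and the OCR engine in use.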