Text Extraction from a Table Image with a Nested Table (Table in Table)

This is SampleTable.png

I ran OCR on the image, but the text in the "Observation" column is not extracted: that column is left blank in the extracted table.

This is the code I tried:

from IPython.display import display_html
from img2table.document import Image
from img2table.ocr import PaddleOCR

# Extra keyword arguments passed through to PaddleOCR (local model directories)
ocr = PaddleOCR(kw={
    'ocr_version': 'PP-OCRv3',
    'structure_version': 'PP-StructureV2',
    'det_model_dir': './Python_V10/PaddleOCR/ch_PP-OCRv3_det_infer',
    'rec_model_dir': './Python_V10/PaddleOCR/en_PP-OCRv3_rec_infer',
    'cls_model_dir': './Python_V10/PaddleOCR/ch_ppocr_mobile_v2.0_cls_infer',
    'table_model_dir': './Python_V10/PaddleOCR/en_ppstructure_mobile_v2.0_SLANet_infer',
    'layout_model_dir': './Python_V10/PaddleOCR/picodet_lcnet_x1_0_fgd_layout_infer',
    'lang': 'en',
})

# Detect bordered tables in the image and fill their cells with OCR text
img = Image('SampleTable.png')
extracted_table = img.extract_tables(ocr=ocr, implicit_rows=False, borderless_tables=False, min_confidence=0)

# Render the first extracted table as HTML in the notebook
display_html(extracted_table[0].html_repr(title='Result'), raw=True)

I need help extracting the important text from the table: "According our result to the requirements, the balance..." and "After the monitor we found that the balance ... or equal to."
