How to extract specific text from a PDF using Python?


These are the items that need to be extracted from the PDF:

[Image: screenshot of the items to extract (not available)]

This is the link to the PDF.

Could anyone show how to solve this in Python, with comments explaining each step?

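import pdf2image
import pytesseract

# Render each PDF page as a PIL image (pdf2image needs Poppler installed).
pages = pdf2image.convert_from_path('/content/SRW1012022Y0002378_220216102321.PDF')

# Run OCR on every page and print the recognized text.
for page_number, page in enumerate(pages, start=1):
    detected_text = pytesseract.image_to_string(page)
    print(f'--- page {page_number} ---')
    print(detected_text)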

I tried the snippet above. It extracts all the text from the PDF, but I can't pick out the specific values I need so that I can apply further logic to them.
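One possible way to go from the full OCR text to specific values is to search it with regular expressions, one pattern per field. The sketch below is only an illustration: the field names (`invoice_no`, `date`), the patterns, the helper `extract_fields`, and the sample text are all hypothetical, because the items actually marked in the screenshot are not known here. The patterns would need to be adapted to the labels that appear in the real document.

```python
import re

# Hypothetical field labels and patterns: replace them with the labels
# that actually appear next to the values in your PDF.
FIELD_PATTERNS = {
    "invoice_no": r"Invoice\s*No\.?\s*[:\-]?\s*(\S+)",
    "date": r"Date\s*[:\-]?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})",
}


def extract_fields(text, patterns=FIELD_PATTERNS):
    """Return a dict mapping each field name to its first match, or None."""
    results = {}
    for name, pattern in patterns.items():
        # IGNORECASE tolerates OCR output that varies in capitalization.
        match = re.search(pattern, text, flags=re.IGNORECASE)
        results[name] = match.group(1).strip() if match else None
    return results


# Made-up sample standing in for pytesseract.image_to_string(page) output.
sample_text = """ACME Traders
Invoice No: INV-1042
Date: 16/02/2022
Total: 1,250.00
"""

fields = extract_fields(sample_text)
print(fields)
```

In the loop from the question, `extract_fields(detected_text)` could be called in place of `print(detected_text)`. A field that comes back as `None` usually means the OCR output spells the label differently from the pattern, so printing the raw text for that page is a reasonable first check.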
