PDF Plumber Extracted Text


I have a PDF document that I am trying to extract text from. This is what I used:

import pdfplumber

with pdfplumber.open('myfile.pdf') as pdf:
    my_page = pdf.pages[2]  # pages is 0-indexed, so index 2 is the 3rd page
    text = my_page.extract_text()

print(text)

Here is what the PDF looks like:
    id     description               cost
    1      toy_car                   $10.00
    2      big_huge_description      $20.00
           _for_a_car
    3      toy_kitchen               $30.00

When I extract the text, the wrapped part of a long description loses its indentation and lands in the first column. For example:

id     description               cost
1      toy_car                   $10.00
2      big_huge_description      $20.00
_for_a_car
3      toy_kitchen               $30.00

How can I output the text so that it looks like this?

id     description                         cost
1      toy_car                             $10.00
2      big_huge_description_for_a_car      $20.00
3      toy_kitchen                         $30.00

Any suggestions?
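
One possible approach is to post-process the extracted string rather than change the extraction itself. The sketch below is only an illustration under some assumptions taken from the example above: every full row has exactly three whitespace-separated fields (id, description, cost), descriptions contain no spaces, and a wrapped fragment shows up as a line with a single field. The helper name merge_wrapped_rows is made up for this example and is not part of pdfplumber.

```python
def merge_wrapped_rows(text):
    """Join wrapped description fragments onto the previous row and re-align columns.

    Assumes three whitespace-separated fields per full row and no spaces
    inside a description (as in the sample table).
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) == 3:
            # Header or full row: id, description, cost
            rows.append(parts)
        elif len(parts) == 1 and rows:
            # Assumed to be a wrapped fragment: append it to the
            # description of the row above
            rows[-1][1] += parts[0]
        else:
            raise ValueError(f"Unexpected line layout: {line!r}")

    # Pad each column to the width of its longest cell
    widths = [max(len(row[i]) for row in rows) for i in range(3)]
    return "\n".join(
        "     ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )
```

Calling print(merge_wrapped_rows(text)) on the string returned by extract_text() should then give one line per item, with "_for_a_car" joined onto "big_huge_description". If real descriptions contain spaces, the three-field check will not hold and the rows would need to be split differently, for example by word position. pdfplumber also provides page.extract_table(), which may keep wrapped cell text together when it can detect the table's cell boundaries; whether that works depends on how the PDF was produced.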
