I want to exclude the page number of a PDF from the extracted text using the pypdf package:
from pypdf import PdfReader
reader = PdfReader("pdf-examples/kurdish-sample-2.pdf")
full_text = ""
for page in reader.pages:
    full_text += page.extract_text() + "\n"
print(full_text)
Output:
5 دوارۆژی ئەم منداڵه بکەنەوە کە چۆن و چی بەسەر دێت و دووچاری
The number 5 is the page number which should be excluded.
You can use the pass statement when the iteration count is 5. Here we use i as an iteration counter and increment it by one every time a page is read. So if i is 5, we are on the fifth page, and we do nothing with it by writing pass.
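The code that originally accompanied this answer is missing, so the following is only a minimal sketch of the counter-and-pass idea it describes, not the author's exact code. The function names join_pages_skipping and strip_leading_page_number are hypothetical. Both work on a list of already extracted page strings so they can be tried without a PDF. Note that the pass approach drops all text of the fifth page. The second helper is a variation that removes only the number itself, assuming the page number is the first thing in the extracted text (as in the sample output above) and equals the 1-based page count.

```python
import re


def join_pages_skipping(page_texts, skip_page=5):
    """Join page texts, doing nothing for the page whose count equals skip_page."""
    full_text = ""
    i = 0  # iteration counter, incremented once per page
    for text in page_texts:
        i += 1
        if i == skip_page:
            pass  # we are on the skipped page: do nothing with it
        else:
            full_text += text + "\n"
    return full_text


def strip_leading_page_number(text, page_number):
    """Hypothetical variation: remove page_number only if it starts the text.

    Assumes the extracted text begins with the page number.
    """
    pattern = r"^\s*" + re.escape(str(page_number)) + r"\b\s*"
    return re.sub(pattern, "", text, count=1)


# Possible usage with pypdf (requires the PDF from the question):
# from pypdf import PdfReader
# reader = PdfReader("pdf-examples/kurdish-sample-2.pdf")
# texts = [page.extract_text() for page in reader.pages]
# cleaned = [strip_leading_page_number(t, n) for n, t in enumerate(texts, start=1)]
# print("\n".join(cleaned))
```

If the PDF's printed page numbers do not match the 1-based position of the page in the file, the page_number argument would need to be adjusted accordingly.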