Download a PDF directly into memory to use it with Python


The goal is to download a PDF file via requests (Python) without saving it to the hard disk. Then I'd like to read it with PdfReader from PyPDF2, again without saving it.

def readFile(self, id):
    req = get(f'{self.workUrl}{id}/content', headers={'OTCSTicket': self.ticket})
    if req.status_code == 200:
        return req.raw
    else:
        raise Exception(f'Error Code {req.status_code}')

obj = server.readFile(id)
reader = PdfReader(obj)

Robert Fisher On BEST ANSWER

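Instead of returning the raw object, wrap req.content (the response body as bytes) in io.BytesIO. That gives you a seekable file-like object that PdfReader can read directly, so nothing is written to disk. Returning req.raw does not work here because PdfReader needs to seek within the stream, and the raw response stream is not seekable.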

Like this:

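import io

import requests
from PyPDF2 import PdfReader

# Method of the server class from the question
def readFile(self, id):
    req = requests.get(
        url=f'{self.workUrl}{id}/content',
        headers={'OTCSTicket': self.ticket},
    )
    if req.ok:
        # req.content is the whole body as bytes; BytesIO makes it seekable
        return io.BytesIO(req.content)
    raise Exception(f'Error Code: {req.status_code}')

reader = PdfReader(server.readFile(id))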
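
One thing that can still go wrong with this approach is a server that answers with a success status but sends an HTML error or login page instead of the file. A minimal sketch of a guard for that, assuming a well-formed PDF starts with the bytes %PDF-. The helper name to_pdf_stream and the fake_body payload are made up for illustration and are not part of requests or PyPDF2:

```python
import io

PDF_MAGIC = b"%PDF-"  # assumption: a well-formed PDF file starts with these bytes


def to_pdf_stream(data: bytes) -> io.BytesIO:
    """Hypothetical helper: wrap downloaded bytes in a seekable in-memory file.

    Raises ValueError when the payload does not look like a PDF, for example
    when the server answered with an HTML error or login page.
    """
    if not data.startswith(PDF_MAGIC):
        raise ValueError(f"not a PDF, payload starts with {data[:20]!r}")
    return io.BytesIO(data)


# Stand-in for req.content so the sketch runs without a network call.
fake_body = b"%PDF-1.7\n% rest of the document would follow\n"
stream = to_pdf_stream(fake_body)

# BytesIO supports seek(), which is what PdfReader relies on.
print(stream.seekable())
```

In readFile, the return line would then become return to_pdf_stream(req.content), and the resulting object can be passed to PdfReader exactly as before.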