Extract PDF Content Including Images For RAG

152 Views Asked by Niyooooo At 26 February 2024 at 09:43

I am trying to build a PDF content extraction and chunking system for RAG in my application. I need to include the images from the PDF as URLs so that the LLM can use those images in its response. Most of the solutions I have seen only extract the text content from a PDF. Is there any way to extract both images and text from a PDF?

There is 1 best solution below

Nick Magnanini - preprocess.co On 07 March 2024 at 08:25

PyMuPDF can do that: besides text, it extracts images and tables from a PDF.
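As a rough illustration, here is a minimal sketch of how this could look with PyMuPDF. It is not taken from the answer; it assumes one chunk per page, that the extracted images are saved to a local folder, and that this folder is served under some base URL. The names `make_chunk`, `extract_pdf`, `image_dir`, and `base_url` are hypothetical and chosen for this example only.

```python
from pathlib import Path


def make_chunk(page_number: int, text: str, image_names: list[str], base_url: str) -> dict:
    """Build one RAG chunk: the page text plus URLs for the images on that page."""
    # Assumption: files saved in image_dir are reachable at base_url/<file name>.
    urls = [f"{base_url.rstrip('/')}/{name}" for name in image_names]
    return {"page": page_number, "text": text.strip(), "image_urls": urls}


def extract_pdf(pdf_path: str, image_dir: str, base_url: str) -> list[dict]:
    """Extract per-page text and embedded images from a PDF with PyMuPDF."""
    # Imported here so make_chunk stays usable without PyMuPDF installed.
    import fitz  # PyMuPDF: pip install pymupdf

    out_dir = Path(image_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunks = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            names = []
            for img in page.get_images(full=True):
                xref = img[0]  # first tuple item is the image's xref number
                info = doc.extract_image(xref)  # dict with raw bytes and extension
                name = f"page{page.number + 1}_xref{xref}.{info['ext']}"
                (out_dir / name).write_bytes(info["image"])
                names.append(name)
            # page.number is 0-based, so add 1 for a human-readable page number.
            chunks.append(make_chunk(page.number + 1, page.get_text(), names, base_url))
    return chunks
```

Calling `extract_pdf("report.pdf", "static/pdf_images", "https://example.com/pdf_images")` (placeholder path and URL) would return a list of dictionaries that can be embedded and stored, with the image URLs kept as metadata for the LLM to cite. Tables are not handled here; recent PyMuPDF versions also provide `page.find_tables()` for that.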
