Trying to build this small demo using poppler fails to compile but poppler.h is found?


I found a short C++ example on Stack Overflow that extracts text from a PDF using the poppler C++ bindings, but I cannot find any call that extracts the images.

I have a PDF which is a sequence of scans from a copier. I would like to open them one after another:

#include <iostream>

#include "poppler-document.h"
#include "poppler-page.h"
#include "poppler-image.h"
using namespace std;

int main()
{
    poppler::document *doc = poppler::document::load_from_file("./test1dld_scan.pdf");
    if (!doc) {
        std::cerr << "Unable to load the document." << std::endl;
        return 1;
    }
    const int num_pages = doc->pages();  // pages() returns int
    cout << "page count: " << num_pages << endl;

    for (int i = 0; i < num_pages; ++i) {
        const auto page = doc->create_page(i);
        if (!page) {
            std::cerr << "Unable to create the page." << std::endl;
            return 1;
        }
        // auto images = page-> ...   // right here: which call returns the images?
    }
}

There is a poppler::image class, but I can't find anything in the documentation that returns one from a PDF. Neither the document class nor the page class mentions images.


There is 1 best solution below

Dov On

OK, I figured it out. I am not aware of any API to extract the actual image from a PDF file, but the renderer can render each page at any desired DPI. Ideally, the DPI should match the resolution of the image inside.

I have a PDF of scanned images. Rendering it at 300 DPI works:

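#include <iostream>
#include <memory>
#include <string>

#include "poppler-document.h"
#include "poppler-page.h"
#include "poppler-page-renderer.h"
#include "poppler-image.h"
using namespace std;
using namespace poppler;

int main() {
    const int DPI = 300;
    // load_from_file returns an owning raw pointer (null on failure).
    unique_ptr<document> doc(document::load_from_file("./test.pdf"));
    if (!doc) {
        cerr << "Unable to load the document." << endl;
        return 1;
    }
    const int num_pages = doc->pages();
    cout << "page count: " << num_pages << endl;

    page_renderer renderer;
    renderer.set_render_hint(page_renderer::text_antialiasing);
    for (int i = 0; i < num_pages; ++i) {
        // create_page also returns an owning pointer; free it every iteration.
        const unique_ptr<page> pg(doc->create_page(i));
        if (!pg) {
            cerr << "Unable to create the page." << endl;
            return 1;
        }

        // Render the full page at DPI x DPI and write it out as page<i>.png.
        image img = renderer.render_page(pg.get(), DPI, DPI);
        cout << "created image of " << img.width() << "x" << img.height() << "\n";
        if (!img.save("page" + to_string(i) + ".png", "png", DPI)) {
            cerr << "Unable to save page " << i << "." << endl;
            return 1;
        }
    }
}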
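
To pick a DPI that matches the scan, it may help to spell out the arithmetic. PDF page sizes are measured in points (1/72 inch), so a page rendered at a given DPI comes out roughly `points * dpi / 72` pixels wide, and the native resolution of a scan that fills the page is `pixels * 72 / points`. The sketch below is only an illustration: `pixels_at_dpi` and `native_dpi` are hypothetical helper names, not part of poppler, and poppler's own rounding may differ by a pixel. It also assumes you already know the page size in points (for example from the page's `page_rect()`) and the pixel width of the embedded scan.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not a poppler function): approximate pixel extent of a
// page dimension given in PDF points (1/72 inch) when rendered at `dpi`.
long pixels_at_dpi(double points, double dpi) {
    assert(points > 0.0 && dpi > 0.0);
    return std::lround(points * dpi / 72.0);
}

// Hypothetical helper: the DPI at which a scan that is `image_pixels` wide
// and fills a page `page_points` wide would be rendered 1:1.
double native_dpi(long image_pixels, double page_points) {
    assert(image_pixels > 0 && page_points > 0.0);
    return static_cast<double>(image_pixels) * 72.0 / page_points;
}
```

For a US Letter page (612 x 792 points), `pixels_at_dpi(612.0, 300.0)` gives 2550 and `pixels_at_dpi(792.0, 300.0)` gives 3300. Conversely, a 2550-pixel-wide scan on that page has `native_dpi(2550, 612.0)` equal to 300, which would be the value to pass to `render_page`.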