HTML to PDF generation with Java

In our project, we need to generate PDFs from HTML content. We tried Flying Saucer and openhtmltopdf, but the HTML we are converting uses CSS3 syntax, and both libraries seem to have poor CSS3 support. As a result, the generated PDF is incomplete and its layout is broken. Is there any way in Java to generate a PDF that looks the same as the page does in a browser?

Here is the code snippet:

// Fetch and parse the page with jsoup
var document = Jsoup.connect(url).get();

try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
    PdfRendererBuilder builder = new PdfRendererBuilder();
    // Use the page URL as the base URI so relative links (CSS, images) can resolve
    builder.withW3cDocument(new W3CDom().fromJsoup(document), url);
    builder.toStream(outputStream);
    builder.run();

    return outputStream.toByteArray();
}

We also tried embedding all the CSS directly in the HTML document, since the original document references external static CSS files. Here is the snippet:

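for (Element link : document.select("link[rel=stylesheet]")) {
    // Resolve the href against the document's base URI
    String cssUrl = link.absUrl("href");

    // Fetch the stylesheet as raw text; parsing it as HTML can mangle the CSS
    String css = Jsoup.connect(cssUrl).ignoreContentType(true).execute().body();

    // Replace the <link> with a <style> element holding the CSS verbatim
    Element style = new Element(Tag.valueOf("style"), "");
    style.appendChild(new DataNode(css));
    link.replaceWith(style);
}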

There are 3 best solutions below

0
Saurabh Deshmukh On

We did something similar, but in Python with wkhtmltopdf. It also had poor CSS support, but when we switched to inline CSS, the generated PDF was formatted properly. You could try inline CSS. I'm not sure it will help in your case, but it helped us.

0
finger On

You can try Chrome's headless print-to-PDF mode:

cmd: google-chrome --headless --disable-gpu --print-to-pdf='/root/test/test.pdf' /root/test/test.html
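
Since the question is about Java, the command above could be launched from Java with ProcessBuilder. This is only a minimal sketch, assuming Chrome is installed and reachable as google-chrome on the PATH. The class and method names (HeadlessChromePdf, buildCommand, convert) and the 60-second timeout are made up for illustration and are not part of any library:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

class HeadlessChromePdf {

    // Builds the same command line as above.
    // chromeBinary is an assumption: it may be "google-chrome", "chromium", etc.
    static List<String> buildCommand(String chromeBinary, Path html, Path pdf) {
        return List.of(
                chromeBinary,
                "--headless",
                "--disable-gpu",
                "--print-to-pdf=" + pdf.toAbsolutePath(),
                html.toAbsolutePath().toString());
    }

    // Runs Chrome, waits for it to finish, and returns the PDF bytes.
    static byte[] convert(String chromeBinary, Path html, Path pdf)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(chromeBinary, html, pdf))
                .redirectErrorStream(true)
                // Discard Chrome's console output so the pipe buffer cannot fill up
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();

        // The timeout value is arbitrary; tune it for your pages
        if (!process.waitFor(60, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("Chrome did not finish within 60 seconds");
        }
        if (process.exitValue() != 0) {
            throw new IOException("Chrome exited with code " + process.exitValue());
        }
        return Files.readAllBytes(pdf);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java HeadlessChromePdf.java input.html output.pdf
        byte[] pdf = convert("google-chrome", Path.of(args[0]), Path.of(args[1]));
        System.out.println("Wrote " + pdf.length + " bytes");
    }
}
```

Because Chrome does the rendering, the PDF should match what the browser's print layout produces, including CSS3 features. The trade-off is that a Chrome binary has to be present on every machine that runs the conversion.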

0
Dheeraj Malik On

You can try Spire.PDF for Java. It provides functionality for rendering HTML with inline CSS to PDF. If you're using external CSS, you'll need to convert it into inline CSS to ensure proper rendering within the PDF document.