JODConverter & LibreOffice: convert doc to html with embedded images


I'm converting doc/docx files to HTML with the JODConverter library (4.2.2) and LibreOffice (6.2). I need the images to be embedded in the HTML file, but by default they are saved as separate files.

To do that with the LibreOffice command-line interface, I use:

soffice --convert-to html:HTML:EmbedImages example.docx

Is there any way to pass the EmbedImages option through the JODConverter library?

My Java code:

LocalConverter
    .make()
    .convert(new FileInputStream(docFile))
    .as(DefaultDocumentFormatRegistry.getFormatByMediaType(file.getMediaType().getName()))
    .to(htmlTempFile)
    .as(DefaultDocumentFormatRegistry.HTML)
    .execute();

sbraconnier On BEST ANSWER

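This should work. Build a custom format based on the default HTML one and pass EmbedImages as the FilterOptions store property for text documents: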

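// Copy the default HTML format and add the EmbedImages filter option,
// mirroring "html:HTML:EmbedImages" on the soffice command line.
final DocumentFormat format =
    DocumentFormat.builder()
        .from(DefaultDocumentFormatRegistry.HTML)
        .storeProperty(DocumentFamily.TEXT, "FilterOptions", "EmbedImages")
        .build();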

LocalConverter
    .make()
    .convert(new FileInputStream(docFile))
    .as(DefaultDocumentFormatRegistry.getFormatByMediaType(file.getMediaType().getName()))
    .to(htmlTempFile)
    .as(format)
    .execute();
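
To confirm that the option took effect, one rough check is to scan the generated HTML for img tags whose src is not a data: URI. The sketch below is only an illustration and uses nothing beyond the JDK. It assumes LibreOffice writes embedded images inline as data: URIs in the src attribute, and the class and method names (EmbeddedImageCheck, countExternalImages) are hypothetical, not part of JODConverter. It relies on a simple regular expression rather than a real HTML parser, so treat the result as a sanity check only.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JODConverter or LibreOffice.
final class EmbeddedImageCheck {

    // Captures the quoted src value of each <img> tag (single or double quotes).
    private static final Pattern IMG_SRC = Pattern.compile(
        "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns how many img tags point at something other than a data: URI. */
    static int countExternalImages(String html) {
        int external = 0;
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            String src = m.group(2).trim();
            // Assumption: embedded images appear as src="data:...".
            if (!src.regionMatches(true, 0, "data:", 0, 5)) {
                external++;
            }
        }
        return external;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to the converted HTML file.
        String html = new String(
            Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("External images: " + countExternalImages(html));
    }
}
```

A result of 0 for a document that contains images suggests they were all embedded; any other value means some img tags still reference separate files.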