docx4j-ImportXHTML overrides the style's font details with calculated CSS font properties


I am using docx4j-ImportXHTML to convert XHTML to a docx file. My document headings (h1, h2, etc.) are being mapped to styles in my Word document. However, the styles' font details are being overridden by calculated CSS font properties. If no explicit CSS is present, I see the font properties defined in docx4j.properties instead.

  • docx4j-JAXB-ReferenceImpl:11.4.9
  • docx4j-ImportXHTML:11.4.8
  • I have a reference docx file from which I copy styles to my generated file.
  • I have set the docx4j-ImportXHTML property docx4j-ImportXHTML.Element.Heading.MapToStyle=true.
  • I am experimenting with the formatting options on XHTMLImporterImpl to customize the behavior.

docx4j-ImportXHTML.fonts.default.serif=Roboto
docx4j-ImportXHTML.fonts.default.sans-serif=Roboto
docx4j-ImportXHTML.fonts.default.monospace=Courier New
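docx4j-ImportXHTML.Element.Heading.MapToStyle=true

import java.util.List;
import org.docx4j.convert.in.xhtml.FormattingOption;
import org.docx4j.convert.in.xhtml.XHTMLImporterImpl;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

// documentStyles: the styles copied from the reference docx (loaded elsewhere).
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
wordMLPackage.getMainDocumentPart()
             .getStyleDefinitionsPart()
             .setJaxbElement(documentStyles);

XHTMLImporterImpl xHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
// Map class attributes to Word paragraph styles instead of converting CSS.
xHTMLImporter.setParagraphFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);
//xHTMLImporter.setRunFormatting(FormattingOption.CLASS_TO_STYLE_ONLY);

// html: the XHTML source string; the second argument is the base URL.
List<Object> parts = xHTMLImporter.convert(html, null);
wordMLPackage.getMainDocumentPart().getContent().addAll(parts);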

For headings specifically, I would like the Word style to dictate the appearance (font, font size, color, and spacing). Here is how my Heading 3 style is defined:

[Screenshot: Heading 3 style definition]

And here is how it is applied:

[Screenshot: Heading 3 style as applied in the generated document]

There is no CSS in my document targeting h3 elements. I believe the font family is dictated by docx4j-ImportXHTML and the font size is being dictated by a default stylesheet.

My Heading 3 style is based on the Normal paragraph style, and it feels like the calculated CSS properties are being applied to the text run inside the paragraph.
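To make the suspected mechanism concrete, here is a minimal sketch of Word's formatting precedence, in which direct formatting on a run takes priority over the paragraph style. The classes below (RunProps, Run, Paragraph) are hypothetical stand-ins written for illustration and are not docx4j classes; the style id "Heading3" and the font "Georgia" are likewise assumptions. The sketch also shows one possible post-processing idea: clearing direct run formatting on heading paragraphs so that the style's values apply again.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of Word's formatting precedence; these are NOT docx4j classes. */
public class HeadingFormattingSketch {

    /** Hypothetical stand-in for direct run formatting (w:rPr in a docx). */
    static class RunProps {
        String font;      // null means "inherit from the style"
        Integer sizePt;   // null means "inherit from the style"
        RunProps(String font, Integer sizePt) { this.font = font; this.sizePt = sizePt; }
    }

    /** Hypothetical stand-in for a text run (w:r). */
    static class Run {
        final String text;
        RunProps props;   // null means "no direct formatting"
        Run(String text, RunProps props) { this.text = text; this.props = props; }
    }

    /** Hypothetical stand-in for a paragraph (w:p) with a paragraph style id. */
    static class Paragraph {
        final String styleId;
        final List<Run> runs = new ArrayList<>();
        Paragraph(String styleId) { this.styleId = styleId; }
    }

    /** Direct run formatting, when present, wins over the style's font. */
    static String effectiveFont(Run run, String styleFont) {
        return (run.props != null && run.props.font != null) ? run.props.font : styleFont;
    }

    /** Drops direct run formatting from heading paragraphs so the style applies. */
    static void clearHeadingRunFormatting(List<Paragraph> paragraphs) {
        for (Paragraph p : paragraphs) {
            if (p.styleId != null && p.styleId.startsWith("Heading")) {
                for (Run r : p.runs) {
                    r.props = null;
                }
            }
        }
    }

    public static void main(String[] args) {
        Paragraph h3 = new Paragraph("Heading3");   // assumed style id
        h3.runs.add(new Run("Section title", new RunProps("Roboto", 14)));
        String styleFont = "Georgia";               // assumed font of the Heading 3 style

        // Before: the run's direct font hides the style's font.
        System.out.println(effectiveFont(h3.runs.get(0), styleFont));
        clearHeadingRunFormatting(List.of(h3));
        // After: with no direct formatting, the style's font applies.
        System.out.println(effectiveFont(h3.runs.get(0), styleFont));
    }
}
```

If the same idea were applied to the list returned by the importer, the walk would be over docx4j's paragraph and run objects rather than these toy types; whether that is the intended way to use the library is exactly what I am unsure about.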

Is it possible to configure docx4j-ImportXHTML to only use the Word doc styles? I'm not sure if I'm missing some details or if I'm working against the spirit of the library.
