How to create a valid docx with a table using Open XML


So far, every example I was able to find here or elsewhere (like https://stackoverflow.com/a/72016283/13831836) was barebones code that creates something that may or may not be opened by MS Word, knowing Microsoft's attitude towards enforcing standards, but is not a valid document. By valid I mean:

  • Libre Office opens it and displays more or less correctly (a must)
  • Open Office opens it and displays more or less correctly (a must)
  • MS Open XML Productivity Tool validates it (nice to have)

The document in question is rather simple: a short paragraph, followed by a large table (600+ rows), followed by another short paragraph.

I attempted to fix the errors displayed by the Productivity Tool on my own, but that turned out to be too difficult without knowing much more about the SDK and the format than I do. Most of the errors have to do with invalid children of a table, a row, etc., or with missing styles and so on. I could fix some of them with the help of an internet search, but not all.

I am not against using libraries built on top of the Open XML SDK, provided the license is right, but so far I haven't been able to find one that actually works.

user246821 On

If you want to support Libre Office, Open Office, and Word, then

  1. Use each of them to create the document in the format that you desire.
  2. Open each of the documents with the other two programs (for example, if you created a document in Word, ensure that it successfully opens with Libre Office and Open Office).
  3. Once you've identified which document works with all three, then use the Open XML SDK 2.5 Productivity Tool to generate the code needed to programmatically create the document.

Given: One has identified that a document created with LibreOffice also works correctly in OpenOffice and Word.

One may consider using the document created in LibreOffice as a template, and modifying it using the NuGet package DocumentFormat.OpenXml.

If desired, one can create the entire document programmatically. One may start by reading "Welcome to the Open XML SDK for Office", as well as the articles it references. However, there is an easier way: download and install the Open XML SDK 2.5 Productivity Tool, use it to generate the code, and then modify that code to suit one's needs. One must understand that the Open XML SDK Productivity Tool hasn't been updated in a while and may not support the latest versions of Open XML (i.e., the latest versions of DocumentFormat.OpenXml, LibreOffice, MS Office, etc.). The versions I used for testing are mentioned throughout the post.
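The template approach could be sketched roughly as follows. This is a minimal sketch, not code produced by the Productivity Tool: the class name TemplateFiller, the method name AppendRow, and the assumption that the first table in the template is the one to fill are all hypothetical. It copies the template, opens the copy for editing, and appends one row of plain-text cells.

```csharp
using System;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: fills a copy of a template .docx instead of building the document from scratch.
public static class TemplateFiller
{
    // Copies templatePath to outputPath, then appends one row to the first table in the copy.
    public static void AppendRow(string templatePath, string outputPath, params string[] cellTexts)
    {
        // Work on a copy so the template itself is never modified.
        File.Copy(templatePath, outputPath, overwrite: true);

        using (WordprocessingDocument doc = WordprocessingDocument.Open(outputPath, true))
        {
            var document = doc.MainDocumentPart?.Document
                ?? throw new InvalidOperationException("The template has no main document part.");

            // Assumption: the table to fill is the first table in the document body.
            var table = document.Body?.Elements<Table>().FirstOrDefault()
                ?? throw new InvalidOperationException("The template contains no table.");

            var row = new TableRow();
            foreach (string text in cellTexts)
            {
                // Every table cell needs at least one paragraph to be schema-valid.
                row.Append(new TableCell(new Paragraph(new Run(new Text(text)))));
            }

            table.Append(row);
            document.Save();
        }
    }
}
```

Rows created this way carry no cell properties of their own, so they may not pick up the widths or borders of the rows already in the template. If that matters, cloning an existing row and replacing its text is one alternative to consider.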


Pre-requisite:

Create a new Windows Forms App (name: OpenXmlLibreOfficeTest)

Note: For Framework, ensure you select .NET 8.

Open Solution Explorer

  • In VS menu, click View
  • Select Solution Explorer

Download/install NuGet package: DocumentFormat.OpenXml (v. 2.19.0)

  • In Solution Explorer, right-click <project name>
  • Select Manage NuGet Packages...
  • Click Browse tab
  • In search box, type DocumentFormat.OpenXml
  • Select desired version (ex: 2.19.0)
  • Click Install

Add a class to the project (name: HelperOpenXml.cs)

HelperOpenXml.cs:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace OpenXmlLibreOfficeTest
{
    public class HelperOpenXml
    {
    }
}

In the OP, you stated that you have a document that has a paragraph, then a table, then another paragraph. I created a document with that layout using LibreOffice (v.7.5.9) for testing (name: Test.docx).

Open the .docx file using the Open XML SDK 2.5 Productivity Tool (ex: Test.docx).


Click Reflect Code


Copy all of the code shown in the Open XML SDK 2.5 Productivity Tool to our class (ex: HelperOpenXml.cs), and then rename the namespace GeneratedCode to OpenXmlLibreOfficeTest and the class GeneratedClass to HelperOpenXml.

Alternatively, one can first copy the using directives shown in Open XML SDK 2.5 Productivity Tool, and then copy the code within the class (GeneratedClass) to our class (ex: HelperOpenXml.cs) - this will eliminate the need to rename the namespace and class.

Build your application

  • In VS menu, select Build
  • Select Build <project name>

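You may see build errors, typically ambiguous references to Color, Font, and FontFamily, because a Windows Forms project also has types with those names in scope from System.Drawing.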

If so, add the following using directives to your class (ex: HelperOpenXml.cs):

using Color = DocumentFormat.OpenXml.Wordprocessing.Color;
using Font = DocumentFormat.OpenXml.Wordprocessing.Font;
using FontFamily = DocumentFormat.OpenXml.Wordprocessing.FontFamily;

Then, Build your application

  • In VS menu, select Build
  • Select Build <project name>

To create the .docx file programmatically, just call CreatePackage.

Usage:

Note: I've added a button (name: buttonCreateDocument) to the form and double-clicked it to create the event handler.

private void buttonCreateDocument_Click(object sender, EventArgs e)
{
    using (SaveFileDialog sfd = new SaveFileDialog())
    {
        sfd.Filter = "OpenXml Document|*.docx";

        if (sfd.ShowDialog() == DialogResult.OK)
        {
            HelperOpenXml helperOpenXml = new HelperOpenXml();
            helperOpenXml.CreatePackage(sfd.FileName);
        }
    }
}
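The question lists validation with the Productivity Tool as a nice-to-have. The SDK package also exposes a validator that can be run from code, which may be handy for checking each generated file. The following is a minimal sketch under stated assumptions: the class name DocxValidation and the method name FindErrors are hypothetical, and validating against FileFormatVersions.Office2013 is just one possible choice.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Validation;

// Hypothetical helper: runs the SDK's schema and semantic validation on a .docx file.
public static class DocxValidation
{
    // Returns one human-readable line per validation error; an empty list means no errors were reported.
    public static List<string> FindErrors(string path)
    {
        var messages = new List<string>();

        // Open read-only; validation does not modify the package.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            // Assumption: Office 2013 is the target format version.
            var validator = new OpenXmlValidator(FileFormatVersions.Office2013);

            foreach (ValidationErrorInfo error in validator.Validate(doc))
            {
                messages.Add($"{error.Id}: {error.Description} (at {error.Path?.XPath} in {error.Part?.Uri})");
            }
        }

        return messages;
    }
}
```

For example, calling DocxValidation.FindErrors(sfd.FileName) right after CreatePackage and showing the result in a MessageBox would surface problems such as invalid children of a table or row. A clean validation result does not guarantee that LibreOffice or OpenOffice will render the file as expected, so opening it in each program is still worthwhile.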

Now, one can modify the code within the class (ex: HelperOpenXml.cs) as desired. For example, you'll notice that all of the table data is, in essence, hard-coded. This code can be re-written.

Perhaps one has data stored in a DataTable, or maybe one has a class Computer:

public class Computer
{
    public string? Make { get; set; }
    public string? Model { get; set; }
    public string? SerialNumber { get; set; }
}

and defines List<Computer> computers = new List<Computer>(); (see List<T>). As you probably already know, one would modify the code to create the table rows within a loop.
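Such a loop might look like the sketch below. It relies on the Computer class shown above; the class name ComputerTableRows and the methods CreateRow and AppendRows are hypothetical and are not part of the generated code. The idea is to call AppendRows wherever the generated code currently appends its hard-coded TableRow objects to the table.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: replaces hard-coded table rows with rows built from data.
public static class ComputerTableRows
{
    // Builds one table row with a plain-text cell per value; null values become empty cells.
    public static TableRow CreateRow(params string?[] cellTexts)
    {
        var row = new TableRow();
        foreach (string? value in cellTexts)
        {
            var text = new Text(value ?? string.Empty)
            {
                // Keep leading and trailing spaces in the cell text.
                Space = DocumentFormat.OpenXml.SpaceProcessingModeValues.Preserve
            };

            // Every table cell needs at least one paragraph to be schema-valid.
            row.Append(new TableCell(new Paragraph(new Run(text))));
        }
        return row;
    }

    // Appends one row per Computer (Make, Model, SerialNumber) to an existing table.
    public static void AppendRows(Table table, IEnumerable<Computer> computers)
    {
        foreach (Computer computer in computers)
        {
            table.Append(CreateRow(computer.Make, computer.Model, computer.SerialNumber));
        }
    }
}
```

Any table, row, or cell properties that the generated code sets on its hard-coded rows (widths, borders, styles) would need to be carried over into CreateRow for the new rows to look the same. With data in a DataTable, the same pattern should apply by iterating over its Rows collection instead.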