Parsing an XML string to XML and extracting node value in XSLT

I've been trying to parse an XML string into XML and extract the needed values using the msxsl:node-set function in XSLT. The XSLT code is given below:

<xsl:template name="parseXML">
  <xsl:param name="xmlString"/>
  <xsl:copy-of select="msxsl:node-set($xmlString)/*"/>
</xsl:template>

<xsl:template match="/">
  <xsl:variable name="parsedXML">
    <xsl:call-template name="parseXML">
      <xsl:with-param name="xmlString" select="result/XML"/>
    </xsl:call-template>
  </xsl:variable>
  <element1><xsl:value-of select="$parsedXML"/></element1>
</xsl:template>

I ran the XSLT on the XML below:

<?xml version="1.0" encoding="UTF-8"?>
<result name="HostIntegrationRequest" id="0071" status="0">
  <XML>
    <![CDATA[<Document xmlns="urn:iso:std:iso:20022:tech:xsd:taco.097.001.90"><GrpHdr><MsgId>20230825066</MsgId></GrpHdr></Document>]]></XML>
  <RefId><![CDATA[4]]></RefId>
</result>

But the result XML contains an empty value; I can't even see the value of parsedXML.

Any help solving this issue would be appreciated. If I want to select MsgId, can I do the following?

<element1><xsl:value-of select="$parsedXML/Document/GrpHdr/MsgId"/></element1>

There are 2 best solutions below

michael.hor257k (best answer)

The node-set() function cannot be used to unescape a string into proper XML; it only converts a result tree fragment into a node-set. In XSLT 1.0 you would first need to write the string out with escaping disabled:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="/result">
    <xsl:value-of select="XML" disable-output-escaping="yes"/>
</xsl:template>

</xsl:stylesheet>

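and then apply a second transformation to the resulting file. This only works if the embedded string is itself well-formed XML (for example, every start tag such as GrpHdr needs a matching end tag); otherwise the second pass fails with a parse error.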
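As a rough sketch of that second pass, assuming the first pass produced a well-formed file whose root is the Document element: note that Document declares a default namespace, so an unprefixed path such as Document/GrpHdr/MsgId selects nothing in XSLT 1.0. The prefix d below is an arbitrary choice for illustration, bound to the namespace URI from the sample input.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="urn:iso:std:iso:20022:tech:xsd:taco.097.001.90"
    exclude-result-prefixes="d">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Input: the file written by the first pass (root element: Document) -->
  <xsl:template match="/">
    <element1>
      <!-- Every step needs the d: prefix (an assumed name) because the
           elements inherit the default namespace declared on Document -->
      <xsl:value-of select="d:Document/d:GrpHdr/d:MsgId"/>
    </element1>
  </xsl:template>

</xsl:stylesheet>
```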

Alternatively, you could try to extract the data using string manipulation, for example:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/result">
    <result>
        <xsl:value-of select="substring-before(substring-after(XML, '&lt;MsgId>'), '&lt;/MsgId>')"/>
    </result>
</xsl:template>

</xsl:stylesheet>

will return:

<?xml version="1.0" encoding="UTF-8"?>
<result>20230825066</result>

However, this is not truly parsing the XML and can easily fail if the lexical representation of the XML varies.

In XSLT 3.0 you can use the parse-xml() function to parse the escaped string as XML.
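A minimal sketch of the XSLT 3.0 route, assuming a 3.0 processor is available and that the CDATA content is well-formed: parse-xml() returns a document node, so ordinary path expressions work on the result. The prefix d is an assumption made for this example; it is bound to the default namespace used by the embedded Document.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="urn:iso:std:iso:20022:tech:xsd:taco.097.001.90"
    exclude-result-prefixes="d">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/result">
    <!-- parse-xml() turns the string value of XML into a document node -->
    <xsl:variable name="parsedXML" select="parse-xml(XML)"/>
    <element1>
      <!-- The embedded elements are namespaced, hence the d: prefix -->
      <xsl:value-of select="$parsedXML/d:Document/d:GrpHdr/d:MsgId"/>
    </element1>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample input from the question, this should emit an element1 element containing the MsgId text.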

Martin Honnen

Note that, with Microsoft's XSLT processors, you can also implement your own parseXml function as an extension. With the classic, COM-based MSXML (supported versions are MSXML 3 and 6, I think) you can do that with JScript. With the .NET-based XslCompiledTransform you can do that in a .NET language like C# or VB.NET, by constructing an XPathDocument from the string you pass in.

For JScript with MSXML 6:

var xmlDoc = new ActiveXObject('Msxml2.DOMDocument.6.0');
xmlDoc.load('embedded-xml-sample1.xml');

var xsltDoc = new ActiveXObject('Msxml2.FreeThreadedDOMDocument.6.0');
xsltDoc.load('parseXmlExample1.xsl');

var xsltTemplate = new ActiveXObject('Msxml2.XSLTemplate.6.0');
xsltTemplate.stylesheet = xsltDoc;

var xsltProcessor = xsltTemplate.createProcessor();

// Expose parseXml() to the stylesheet under the http://example.com/mf namespace
xsltProcessor.addObject({
  "parseXml": function(xml) {
    var doc = new ActiveXObject('Msxml2.DOMDocument.6.0');
    doc.loadXML(xml);
    return doc;
  }
}, 'http://example.com/mf');

xsltProcessor.input = xmlDoc;

xsltProcessor.transform();

WScript.Echo(xsltProcessor.output);

For .NET with XslCompiledTransform:

using System;
using System.IO;
using System.Xml.XPath;
using System.Xml.Xsl;

XslCompiledTransform xsltProcessor = new XslCompiledTransform();
xsltProcessor.Load("parseXmlExample1.xsl");

var xsltArguments = new XsltArgumentList();
xsltArguments.AddExtensionObject("http://example.com/mf", new XsltExtensions());

xsltProcessor.Transform("embedded-xml-sample1.xml", xsltArguments, Console.Out);

public class XsltExtensions
{
    public static XPathNavigator parseXml(string xml)
    {
        var doc = new XPathDocument(new StringReader(xml));
        return doc.CreateNavigator();
    }
}

XSLT for both (in this case solely demonstrating that the extension function parses the XML into a node):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:mf="http://example.com/mf">
    
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
    
    <xsl:template match="XML">
        <xsl:copy>
            <xsl:apply-templates select="mf:parseXml(string())"/>
        </xsl:copy>
    </xsl:template>
    
</xsl:stylesheet>

Here is a Gist with examples for XslCompiledTransform and MSXML 6: https://gist.github.com/martin-honnen/c5070070d12ac4ba5c72d39afbdae65d
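To get from the parsed node to the MsgId value asked about in the question, the extension function can be combined with a namespace-aware path. The following is an untested sketch that assumes mf:parseXml has been registered by one of the host programs above. The prefix d is a name chosen for this example, bound to the default namespace of the embedded Document.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:mf="http://example.com/mf"
    xmlns:d="urn:iso:std:iso:20022:tech:xsd:taco.097.001.90"
    exclude-result-prefixes="mf d">

  <xsl:template match="/result">
    <!-- mf:parseXml() is supplied by the host code, not by XSLT itself -->
    <xsl:variable name="parsedXML" select="mf:parseXml(string(XML))"/>
    <element1>
      <!-- Prefixed steps, because the embedded elements are namespaced -->
      <xsl:value-of select="$parsedXML/d:Document/d:GrpHdr/d:MsgId"/>
    </element1>
  </xsl:template>

</xsl:stylesheet>
```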