There are some user-defined entities in the XML data. To unescape those entities, we are using the code below:
<xsl:stylesheet version='3.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' >
<xsl:output method="xml" omit-xml-declaration="no" use-character-maps="mdash" />
<xsl:character-map name="mdash">
<xsl:output-character character="—" string="&amp;mdash;"/>
<xsl:output-character character="&amp;" string="&amp;amp;" />
<xsl:output-character character="&quot;" string="&amp;quot;" />
<xsl:output-character character="'" string="&amp;apos;" />
<xsl:output-character character="§" string="&amp;sect;"/>
<xsl:output-character character="$" string="&amp;dollar;" />
<xsl:output-character character="/" string="&amp;sol;" />
<xsl:output-character character="-" string="&amp;hyphen;" />
</xsl:character-map>
<!--=================================================================-->
<xsl:template match="@* | node()">
<!--=================================================================-->
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
But there is a special case where § appears twice in a row in the data, for example:
Ex- The number §§ 1234
The above example should be converted to a special user-defined entity, i.e.
Output- The number &multisect; 1234
The §§ should be converted to &multisect;
If you want to use a character map, you would first need to process the text nodes where you expect the two sect characters to occur and replace them with a single character you don't expect to be used elsewhere; the map could then convert that character to the string &multisect;. For example, a stylesheet that replaces §§ with '«' in text nodes and maps '«' to &multisect; transforms the input "The number §§ 1234" into the output "The number &multisect; 1234".
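The approach described above can be sketched roughly as follows. This is a minimal illustration, not the original stylesheet: it reuses the map name mdash from the question, keeps only the § entry of the original map for brevity, and assumes '«' never occurs in the real data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="no" use-character-maps="mdash"/>

  <xsl:character-map name="mdash">
    <!-- Assumption: '«' is a placeholder that stands for a doubled section sign -->
    <xsl:output-character character="«" string="&amp;multisect;"/>
    <!-- A single section sign is still mapped as in the question -->
    <xsl:output-character character="§" string="&amp;sect;"/>
  </xsl:character-map>

  <!-- Identity transform: copy everything that has no more specific template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes containing §§: swap each pair for the placeholder,
       which the character map turns into &multisect; at serialization time -->
  <xsl:template match="text()[contains(., '§§')]">
    <xsl:value-of select="replace(., '§§', '«')"/>
  </xsl:template>

</xsl:stylesheet>
```

With an input such as <p>The number §§ 1234</p>, the serialized result should contain The number &multisect; 1234. An odd run such as §§§ would come out as &multisect;&sect;, because replace() consumes the pairs from left to right. If the data also has §§ inside attribute values, a similar template matching those attributes would be needed.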
Note that I used '«' primarily as an example; you might want to use a private-use character or some other character you are sure doesn't occur in your input/output data. If you want the result to be well-formed, you would also need to add a doctype to the output with e.g. xsl:output doctype-system="some.dtd", where you ensure that some.dtd declares e.g. <!ENTITY multisect "§§">
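As a hedged sketch of that last point, the output declaration might look like the fragment below. The file name some.dtd is the placeholder used above, and the DTD content shown in the comment is an assumption about what that file could contain; it would also have to declare every other entity the character map emits (mdash, sect, dollar, sol, hyphen).

```xml
<!-- Emits <!DOCTYPE ... SYSTEM "some.dtd"> ahead of the root element of the result -->
<xsl:output method="xml"
            omit-xml-declaration="no"
            doctype-system="some.dtd"
            use-character-maps="mdash"/>

<!-- Assumed content of some.dtd (a separate file, not part of the stylesheet):
       <!ENTITY multisect "&#xA7;&#xA7;">
       <!ENTITY sect "&#xA7;">
-->
```

A parser that reads the external DTD can then resolve &multisect; back to §§ when the result document is loaded.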