How to ignore the n first tabulations in an XSLT template


The situation

In my XML files, I may have a piece of code to show inside a <code> tag. But the indentation of my XML document conflicts with the tabs inside the <code> section.

Minimal Working Example

The XML file
        <article>
            <code lang="c">
            #include &lt;stdio.h&gt;
            int main() {
                // printf() displays the string inside quotation
                printf("Hello, World!");
                return 0;
            }
            </code>
        </article>
The piece of XSLT
    <xsl:template match="code">
        <pre><xsl:value-of select="."/></pre>
    </xsl:template>
The expected HTML rendering
<pre>#include &lt;stdio.h&gt;
int main() {
    // printf() displays the string inside quotation
    printf("Hello, World!");
    return 0;
}</pre>

Explanations

As you can see, the goal is to ignore the first n tabs and the last n tabs (if any) inside the tags, where n is the number of tabs before the opening <code> tag. The first newline and the last newline (the one just before the tabs preceding the closing </code> tag) should also be ignored.

More explanations

Following @michael.hor257k's suggestion to clarify: in other words, the XSLT stylesheet should treat the XML <code> part shown above as if it were written like this:

        <article>
            <code lang="c">#include &lt;stdio.h&gt;
int main() {
    // printf() displays the string inside quotation
    printf("Hello, World!");
    return 0;
}</code>
        </article>

As you can see, the tabs belonging to the XML indentation should not be included in the final HTML <pre> tag.

Put more graphically, the tabs commented out below should be ignored during processing:

        <article>
            <code lang="c"><!--
         -->#include &lt;stdio.h&gt;
<!--     -->int main() {
<!--     -->    // printf() displays the string inside quotation
<!--     -->    printf("Hello, World!");
<!--     -->    return 0;
<!--     -->}<!--
         --></code>
        </article>

These spaces, tabs, and newlines correspond to the XML indentation and not to the internal indentation of the C code.

Conclusion — Question

So, is it possible in my XSLT to count the tabs before the opening <code> tag in order to delete them from the beginning of each line of its content?

michael.hor257k On BEST ANSWER

Try perhaps something like:

XSLT 1.0 + EXSLT

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:str="http://exslt.org/strings"
extension-element-prefixes="str">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

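<xsl:template match="code">
    <!-- indentation of the opening tag: the text after the first newline in the nearest preceding text node -->
    <xsl:variable name="indent" select="substring-after(preceding-sibling::text()[1], '&#10;')" />
    <pre>
        <!-- process every line that is not whitespace-only -->
        <xsl:for-each select="str:tokenize(., '&#10;')[normalize-space()]">
            <!-- keep what follows the first occurrence of $indent, i.e. drop the XML indentation -->
            <xsl:value-of select="substring-after(., $indent)"/>
            <xsl:if test="position()!=last()">
                <xsl:text>&#10;</xsl:text>
            </xsl:if>
        </xsl:for-each>
    </pre>
</xsl:template>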

</xsl:stylesheet>

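Note that this makes some assumptions that your example satisfies but other cases may not: the text node before <code> must contain a single newline followed only by the indentation, every non-blank line of the code must begin with that same indentation (a line that does not contain it comes out empty), and blank lines inside the code are dropped.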
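
If your processor supports XSLT 2.0 (Saxon, for example), the same idea can be sketched without the EXSLT extension. This is only an illustration, not a tested drop-in replacement: the variable names indent and lines are arbitrary, the html output method is my assumption based on the <pre> target in the question, and it differs from the stylesheet above in two small ways. It takes the indentation from whatever follows the last newline before <code>, and it leaves a line unchanged when the line does not start with that indentation.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="code">
    <!-- indentation of the opening tag: the text after the last newline
         in the nearest preceding text node (empty if there is none) -->
    <xsl:variable name="indent"
        select="tokenize(string(preceding-sibling::text()[1]), '&#10;')[last()]"/>
    <!-- the lines of the content, skipping whitespace-only ones -->
    <xsl:variable name="lines"
        select="tokenize(., '&#10;')[normalize-space()]"/>
    <pre>
        <!-- strip $indent only where a line really starts with it,
             then join the lines with newlines -->
        <xsl:value-of
            select="for $l in $lines
                    return if (starts-with($l, $indent))
                           then substring($l, string-length($indent) + 1)
                           else $l"
            separator="&#10;"/>
    </pre>
</xsl:template>

</xsl:stylesheet>
```

As with the XSLT 1.0 version, blank lines inside the code are not preserved, and the indentation must use the same characters (tabs or spaces) before the tag and at the start of each line.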