.NET XML parsing with optional internal (sub)tags: am I doing this right?


Shipping is a feature, and this is working... but I have a very strong feeling there is a much easier way to do this that will let me do similar things in the future much more easily, so here I am...

I'm reading an Excel sharedStrings file that looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="61" uniqueCount="60">
<si><t>Value</t></si><si><t>Notes</t></si>
<si><r><t xml:space="preserve">Time                            </t></r><r><rPr><b/><sz val="9"/><color theme="1"/><rFont val="Arial"/><family val="2"/></rPr><t>(Do not change!)</t></r></si>
<si><t>Date</t></si>

I need just the "inner" strings: "Time", "Date", etc. I wrote code to do this by looking for the <t> tags and reading their Value into an array of strings. This particular file, however, has two <t> elements in one cell (the "Time" cell), which my code interpreted as two different strings and put into two slots. Bad.

So I rewrote the code to look for the si's instead...

            RD = New XmlTextReader(New StringReader(SBXMLData.Trim()))
            I = -1
            Do While RD.Read()
                If RD.NodeType = XmlNodeType.Element AndAlso RD.LocalName = "si" Then
                    'this is a cell for a string, so to start with we want to increment the index
                    I += 1
                    While RD.Read()
                        If RD.NodeType = XmlNodeType.EndElement AndAlso RD.LocalName = "si" Then
                            Exit While
                        End If
                        If RD.NodeType = XmlNodeType.Element AndAlso RD.LocalName = "t" Then
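                            'ReadString returns "" for an empty <t/> and stops at </t>;
                            'Read() followed by Value would pick up whatever node comes next
                            Dim Piece As String = RD.ReadString()
                            If Strings.ContainsKey(I) Then
                                Strings(I) &= Piece
                            Else
                                Strings.Add(I, Piece)
                            End If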
                        End If
                    End While
                End If
            Loop
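
For comparison, here is a minimal sketch of the same streaming idea using XmlReader.ReadSubtree and ReadToFollowing, both of which exist in System.Xml since .NET 2.0. The module and function names are hypothetical. The namespace URI is copied from the <sst> element in the sample. It returns a List(Of String) rather than filling the Strings dictionary, and it assumes the input is a complete, well-formed document.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Text
Imports System.Xml

Module SharedStringsReaderSketch
    ' Namespace URI taken from the sample file's <sst> element.
    Private Const MainNs As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

    ' Hypothetical helper: returns one string per <si>, in document order.
    Public Function ReadSharedStringsStreaming(ByVal xml As String) As List(Of String)
        Dim result As New List(Of String)()
        Using rd As XmlReader = XmlReader.Create(New StringReader(xml.Trim()))
            While rd.ReadToFollowing("si", MainNs)
                Dim sb As New StringBuilder()
                ' The subtree reader sees only this <si> and its descendants,
                ' so no explicit check for the closing </si> is needed.
                Using cell As XmlReader = rd.ReadSubtree()
                    While cell.ReadToFollowing("t", MainNs)
                        ' ReadString returns "" for an empty <t/>.
                        sb.Append(cell.ReadString())
                    End While
                End Using
                result.Add(sb.ToString())
            End While
        End Using
        Return result
    End Function
End Module
```

An <si> with no <t> at all still gets an (empty) entry here, so the list index always matches the shared-string index.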

This works, but it seems like there should be a simpler way to accomplish this. I guess my question is: how do I get all the values from the path "si.t", no matter whether there is an r, a font, or anything else in between?

I suspect LINQ might have support for this, but our platform is 3.51, so that's out. Xamarin is (sometimes) a target too. I know there are third-party parsers that will do this, but they might not be portable. So is there a canonical solution in basic .NET that I should use?
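
To make the "si.t, whatever is in between" idea concrete: one candidate that needs nothing beyond System.Xml is XmlDocument with an XPath descendant query. This is only a sketch. The module name, function name, and the "m" prefix are made up for illustration. The namespace URI comes from the sample's <sst> element. It loads the whole document into memory, which may or may not be acceptable for large workbooks.

```vbnet
Imports System.Collections.Generic
Imports System.Text
Imports System.Xml

Module SharedStringsXPathSketch
    ' Namespace URI taken from the sample file's <sst> element.
    Private Const MainNs As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

    ' Hypothetical helper: returns one string per <si>, in document order.
    Public Function ReadSharedStringsXPath(ByVal xml As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(xml.Trim())

        ' XPath has no notion of a default namespace, so bind an arbitrary prefix.
        Dim ns As New XmlNamespaceManager(doc.NameTable)
        ns.AddNamespace("m", MainNs)

        Dim result As New List(Of String)()
        For Each si As XmlNode In doc.SelectNodes("/m:sst/m:si", ns)
            Dim sb As New StringBuilder()
            ' ".//m:t" matches every <t> below this <si>, at any depth.
            For Each t As XmlNode In si.SelectNodes(".//m:t", ns)
                sb.Append(t.InnerText)
            Next
            result.Add(sb.ToString())
        Next
        Return result
    End Function
End Module
```

With the sample above, the "Time" cell would come back as a single entry: the padded "Time" text followed by "(Do not change!)".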
