The stringValue of some elements in an XML file contains BOM characters. The XML file is marked as UTF-8 encoded.
Some of those characters are at the beginning of the string (which is where a BOM belongs, from what I've read about it), but some are in the middle of the string (a malformed string from whoever wrote the XML file, maybe?).
I'm opening the file with:
    NSURL *furl = [NSURL fileURLWithPath:fileName];
    if (!furl) {
        NSLog(@"Error: Can't open NML file '%@'.", fileName);
        return kNxADbReaderTTError;
    }
    NSError *err = nil;
    NSXMLDocument *xmlDoc = [[NSXMLDocument alloc] initWithContentsOfURL:furl options:NSXMLNodeOptionsNone error:&err];
And I query the attribute value this way:
NSXMLElement *anElement;
NSString *name;
...
NSString *valueString = [[anElement attributeForName:name] stringValue];
My questions are:
Am I opening the file wrong? Is the file malformed? Am I querying the string value of the element wrong? How can I filter those characters out?
While fixing another issue, I found a relatively clean way of filtering out unwanted characters from the source of an NSXMLDocument. Pasting it here just in case someone encounters a similar issue:
You can pass any character set you want. Note that this sets the options for reading the XML document to none. You might want to change this for your own purposes.
This only filters the contents of attribute strings, which is where my malformed strings came from.
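For reference, here is a minimal sketch of one way to do this kind of filtering. It is not necessarily the same code as the approach described above: it parses the document first (with NSXMLNodeOptionsNone) and then walks the tree, rewriting each attribute's string value with the unwanted characters removed. The function names NxAStringByRemovingCharacters, NxAFilterAttributes and NxAXMLDocumentWithURL are made up for illustration, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of Foundation): returns `string` with every
// character that belongs to `set` removed.
static NSString *NxAStringByRemovingCharacters(NSString *string, NSCharacterSet *set) {
    return [[string componentsSeparatedByCharactersInSet:set] componentsJoinedByString:@""];
}

// Hypothetical helper: recursively strips the characters in `set` from the
// string value of every attribute at or below `node`. Element text is left alone.
static void NxAFilterAttributes(NSXMLNode *node, NSCharacterSet *set) {
    if ([node kind] == NSXMLElementKind) {
        for (NSXMLNode *attribute in [(NSXMLElement *)node attributes]) {
            NSString *value = [attribute stringValue];
            if (value) {
                [attribute setStringValue:NxAStringByRemovingCharacters(value, set)];
            }
        }
    }
    for (NSXMLNode *child in [node children]) {
        NxAFilterAttributes(child, set);
    }
}

// Hypothetical entry point: reads the document with no special options, then
// filters its attributes. Returns nil (with `error` set) if parsing fails.
static NSXMLDocument *NxAXMLDocumentWithURL(NSURL *url, NSCharacterSet *set, NSError **error) {
    NSXMLDocument *doc = [[NSXMLDocument alloc] initWithContentsOfURL:url
                                                              options:NSXMLNodeOptionsNone
                                                                error:error];
    if (!doc) {
        return nil;
    }
    NSXMLElement *root = [doc rootElement];
    if (root) {
        NxAFilterAttributes(root, set);
    }
    return doc;
}
```

To strip BOM characters (U+FEFF) specifically, the set could be built with `[NSCharacterSet characterSetWithCharactersInString:@"\uFEFF"]` and passed as the `set` argument. Filtering after parsing keeps the file on disk untouched, at the cost of walking the whole tree once.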