I think I found a bug in vtd-xml 2.13.4. In short, I have the following code snippet:
String test = "<catalog><description></description></catalog>";
VTDGen vg = new VTDGen();
vg.setDoc(test.getBytes("UTF-8"));
vg.parse(true);
VTDNav vn = vg.getNav();
// select nodes with no children, no text and no attributes
String xpath = "/catalog//*[not(child::node()) and not(child::text()) and count(@*)=0]";
AutoPilot ap = new AutoPilot(vn);
ap.selectXPath(xpath);
// the block inside the while loop is never executed
while (ap.evalXPath() != -1) {
    System.out.println("current node " + vn.toRawString(vn.getCurrentIndex()));
}
and this doesn't work: it does not find any node, while it should find "description". The code above works if I use the self-closed tag:
String test = "<catalog><description/></catalog>";
The point is that every other XPath evaluator works with both versions of the XML. Sadly, I receive the XML from an external source, so I have no control over it. Breaking the XPath down, I noticed that evaluating both
/catalog//*[not(child::node())]
and
/catalog//*[not(child::text())]
gives false as the result. As an additional test, I tried something like:
String xpath = "/catalog/description/text()";
ap.selectXPath(xpath);
if (ap.evalXPath() != -1)
    System.out.println(vn.toRawString(vn.getCurrentIndex()));
And this prints an empty string, so somehow VTD "thinks" the node has text (empty, but still text), while I expect no match. Any hint?
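For comparison, the behaviour of a standard evaluator can be reproduced with the JDK's built-in DOM parser and XPath engine. This is only a minimal sketch: the class name JdkXPathCheck and the helper firstMatch are made up for illustration, and it uses no vtd-xml code at all.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of vtd-xml.
class JdkXPathCheck {
    // Same expression as in the question.
    static final String XPATH =
        "/catalog//*[not(child::node()) and not(child::text()) and count(@*)=0]";

    // Returns the name of the first matched element, or null if nothing matches.
    static String firstMatch(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(XPATH, doc, XPathConstants.NODESET);
            return nodes.getLength() == 0 ? null : nodes.item(0).getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Both forms of the empty element are passed through the same expression.
        System.out.println(firstMatch("<catalog><description></description></catalog>"));
        System.out.println(firstMatch("<catalog><description/></catalog>"));
    }
}
```

With the DOM model an element written as an open/close pair with nothing in between has no child nodes, so both inputs should select "description".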
TL;DR
The long story ...
I faced the same issue. Here are the three main options I first thought of (in order of difficulty):
1. Turn empty elements into self-closed tags in the XML source.
This option isn't always possible (as in the OP's case). Moreover, it may be difficult to pre-process the XML beforehand.
2. Use XMLModifier to fix the VTDNav.
Find the empty elements with an XPath expression, replace them with self-closed tags and rebuild the VTDNav.
2.bis Use XMLModifier#removeToken
A lower-level variant of the preceding solution would consist of looping over the tokens in VTDNav and removing the unnecessary tokens with XMLModifier#removeToken.
3. Patch the vtd-xml code directly.
Taking this path may require more effort and more time. IMO, the optimized vtd-xml code isn't easy to grasp at first sight.
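Where Option 1 is possible, the pre-processing step could look roughly like the sketch below. It is not the implementation referred to in this answer and does not use vtd-xml: it is a plain regex pass over the XML text, and the class name EmptyTagNormalizer and the method collapse are hypothetical. It assumes that attribute values contain no '<' or '>' and that the document has no comments or CDATA sections containing tag-like text, since a regex cannot tell those apart from real markup.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of vtd-xml.
class EmptyTagNormalizer {
    // Matches <name attrs></name> with nothing between the two tags.
    // Group 1 is the element name, group 2 the (possibly empty) attribute text.
    private static final Pattern EMPTY_ELEMENT =
        Pattern.compile("<([A-Za-z_][\\w.:-]*)((?:\\s[^<>]*?)?)\\s*></\\1\\s*>");

    // Rewrites every empty open/close pair as a self-closed tag.
    static String collapse(String xml) {
        return EMPTY_ELEMENT.matcher(xml).replaceAll("<$1$2/>");
    }

    public static void main(String[] args) {
        System.out.println(collapse("<catalog><description></description></catalog>"));
    }
}
```

The rewritten string can then be handed to VTDGen#setDoc in place of the original bytes. Elements that contain only whitespace are left untouched, because they are not strictly empty.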
Option 1 wasn't feasible in my case. I failed to implement Option 2.bis: the "unnecessary" tokens still remained. I didn't look at Option 3 because I didn't want to fix some (rather complex) third-party code.
I was left with Option 2. Here is an implementation:
Code
Output