How to use XMLStarlet to look up values in a second file


Let's assume we have two directories:

/home/a
/home/b

In directory a we have lots of XML files like this:

<root>
  <id>87182378127381273</id>
  <name>just a name</name>
</root>

and for each id there is an XML file in directory b, named after that id, like:

/home/b/87182378127381273.xml
...

and in that file we have for instance:

<root>
  <counter1>879</counter1>
</root>

Now I want to run an xmlstarlet command that outputs the following line for each XML file found in directory a:

87182378127381273,just a name,879
...

I tried to solve this with the following xmlstarlet command:

find . -iname '*.xml' | xargs xmlstarlet sel \
  -t -m "/root" -i "./id" -v "./id" -o "," -v "./name" -b -n | grep -v '^$'

Next I wanted to use the --var option to load the second XML file, constructing its path from the value of id, and output the counter1 value, but I don't know how. Any ideas?

Best answer, by urznow

Answer rewritten after question was rephrased.

This should do it:

# shellcheck  shell=sh  disable=SC2016
find '/home/a' -type f -iname '*.xml' -exec xmlstarlet select --text \
  -t -m 'root[string(id)]' \
       --var bpath='concat("/home/b/",id,".xml")' \
       --var ct='document($bpath)/root/counter1' \
       -v 'concat(id,",",name,",",$ct)' -n \
  {} +
  • skip xargs: find can invoke the command itself, passing as many filenames ({} +) as the command line can hold and repeating as needed
  • use an XPath predicate to match only a root element with a non-empty id child element; no output is generated for a file where root[string(id)] doesn't match
  • use the XSLT document function to look up a value in an external XML file
    • a relative pathname passed to document() must be relative to the directory in which xmlstarlet is invoked (i.e. the current directory)
    • if the target file is inaccessible, xmlstarlet issues a failed to load external entity "…" error message
  • use the XPath concat function to format one record as a single string
  • don't forget select's -T (aka --text) option for plaintext output
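
To try the points above without touching /home/a and /home/b, here is a minimal sketch that recreates the question's two sample files in a scratch directory and runs the same lookup against them. The scratch layout, the file name first.xml and the variables work and out are made up for this illustration, and it assumes the binary is installed as xmlstarlet (some distributions call it xml). The only change to the command is that the lookup directory is interpolated from $work instead of being hard-coded.

```sh
#!/bin/sh
# Scratch copy of the layout from the question (all paths here are hypothetical).
work=$(mktemp -d)
mkdir "$work/a" "$work/b"

# One "a" file holding id and name ...
cat > "$work/a/first.xml" <<'EOF'
<root>
  <id>87182378127381273</id>
  <name>just a name</name>
</root>
EOF

# ... and the matching "b" file, named after the id, holding the counter.
cat > "$work/b/87182378127381273.xml" <<'EOF'
<root>
  <counter1>879</counter1>
</root>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
  # Same command as above; the bpath expression is double-quoted so that the
  # shell expands $work, while $bpath and $ct stay single-quoted for XPath.
  out=$(find "$work/a" -type f -iname '*.xml' -exec xmlstarlet select --text \
    -t -m 'root[string(id)]' \
         --var bpath="concat('$work/b/',id,'.xml')" \
         --var ct='document($bpath)/root/counter1' \
         -v 'concat(id,",",name,",",$ct)' -n \
    {} +)
else
  # Assumption: skip quietly rather than fail when the tool is not installed.
  echo 'xmlstarlet not found, skipping the lookup' >&2
  out=
fi

printf '%s\n' "$out"

# Remove the scratch directory again.
rm -rf "$work"
```

With the sample data this should print the one record the question asks for. Note that interpolating $work into the XPath string only works while the path contains no single quotes, which holds for a typical mktemp result but is not guaranteed for arbitrary directories.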