I have an XML file and I want to compare the id of each entry node with the id of the reaction node. If they are the same, as in the example below, I want to access all the information in the reaction (substrate id and product id). The reaction has two product ids, but this code gives only the first one. Here is the XML file:
<?xml version="1.0"?>
<!DOCTYPE pathway SYSTEM "http://www.kegg.jp/kegg/xml/KGML_v0.7.1_.dtd">
<!-- Creation date: May 31, 2012 14:53:24 +0900 (GMT+09:00) -->
<pathway name="path:ko00010" org="ko" number="00010" >
<entry id="13">
</entry>
<entry id="37" >
</entry>
<reaction id="13" name="rn:R01070" type="reversible">
<substrate id="105" name="cpd:C05378"/>
<product id="132" name="cpd:C00118"/>
<product id="89" name="cpd:C00111"/>
</reaction>
</pathway>
Here is my code:
use strict;
use warnings;
use XML::Simple;
my $xml = new XML::Simple;
my $data = $xml->XMLin("file.xml");
foreach my $entry (keys %{$data->{entry}}) {
foreach my $reaction (keys %{$data->{reaction}}) {
if ($data->{reaction}->{id} eq $data->{entry}->{$entry}->{id} ){
print "substrate:::$data->{reaction}->{substrate}->{id}\n";
print "product:::$data->{reaction}->{product}->{id}\n";
}
}
}
XML::Simple is anything but simple. Its own documentation discourages further use of that module.
The data structure you might be getting (who knows?) is on my system:
It is always good to inspect a data structure when you are not sure whether you are accessing it correctly. One way to do so is `use Data::Dumper; print Dumper $data`.
You might notice that there is no field for `id` in the `entry`. Also, the `product`s do not have an ID field; rather, the `name` attribute is used as the key. *Sigh*. This kind of “cleverness” is why you shouldn't be using XML::Simple.
It is far easier to use a proper parser like `XML::LibXML`. We can then use XPath to select the nodes we want.
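A minimal sketch of that XPath approach might look like the following. It is an illustration under stated assumptions rather than a definitive implementation: the helper name `matching_reactions` is made up for this example, the sample document is a trimmed copy of the XML from the question embedded as a string (for the real file you would pass `location => 'file.xml'` to `load_xml` instead), and `load_ext_dtd => 0` is set on the assumption that you do not want the parser to fetch the KEGG DTD.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed copy of the XML from the question, embedded so the sketch is self-contained.
my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE pathway SYSTEM "http://www.kegg.jp/kegg/xml/KGML_v0.7.1_.dtd">
<pathway name="path:ko00010" org="ko" number="00010">
  <entry id="13"></entry>
  <entry id="37"></entry>
  <reaction id="13" name="rn:R01070" type="reversible">
    <substrate id="105" name="cpd:C05378"/>
    <product id="132" name="cpd:C00118"/>
    <product id="89" name="cpd:C00111"/>
  </reaction>
</pathway>
XML

# Hypothetical helper: returns one hash per reaction whose id equals
# the id of at least one entry element.
sub matching_reactions {
    my ($dom) = @_;
    my @results;
    # XPath 1.0 node-set comparison: the predicate is true if any entry/@id
    # has the same string value as this reaction's @id.
    for my $reaction ($dom->findnodes('/pathway/reaction[@id = /pathway/entry/@id]')) {
        push @results, {
            id         => $reaction->getAttribute('id'),
            substrates => [ map { $_->getAttribute('id') } $reaction->findnodes('substrate') ],
            products   => [ map { $_->getAttribute('id') } $reaction->findnodes('product') ],
        };
    }
    return @results;
}

# Do not try to download the external DTD while parsing.
my $dom = XML::LibXML->load_xml(string => $xml, load_ext_dtd => 0);

for my $r (matching_reactions($dom)) {
    print "reaction:::$r->{id}\n";
    print "substrate:::$_\n" for @{ $r->{substrates} };
    print "product:::$_\n"   for @{ $r->{products} };
}
```

Because `findnodes` returns every matching child in list context, both products are reported; for the sample above this should print reaction 13 with substrate 105 and products 132 and 89.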