How to get the attributes of an XML tag using Ruby with REXML?

I want to print the contents of an XML file in a specific format, using only REXML with Ruby. Here is what the XML file looks like:

<desc>
     <id>2408</id>
     <who name="Joe Silva">[email protected]</who>
     <when>Today</when>
     <thetext>Hello World</thetext>
</desc>
<desc>
     <id>2409</id>
     <who name="Joe Silva2">[email protected]</who>
     <when>Future</when>
     <thetext>Hello World Again</thetext>
</desc>

So far, here is the code I use:

document.elements.each("//desc") do |e|
    # Print the name and text of every child element of this <desc>
    e.elements.each do |i|
        puts "#{i.name} : #{i.text}"
    end
    puts "\n"
end

This gives me the following output:

commentid : 2408
who : [email protected]
bug_when : Today
thetext : Hello World

commentid : 2409
who : [email protected]
bug_when : Future
thetext : Hello World Again

I can access each tag's text but not its attributes. How do I access the attributes so that the output also includes the name attribute?

So the output I want is:

commentid : 2408
name : Joe Silva
who : [email protected]
bug_when : Today
thetext : Hello World

Let me know if further explanation is required.

There is 1 best solution below

Answered by the Tin Man

I wouldn't use REXML, but instead would use Nokogiri. Using that, I'd write something like:

require 'nokogiri'

doc = Nokogiri::XML(<<EOT)
<xml>
<desc>
     <id>2408</id>
     <who name="Joe Silva">[email protected]</who>
     <when>Today</when>
     <thetext>Hello World</thetext>
</desc>
<desc>
     <id>2409</id>
     <who name="Joe Silva2">[email protected]</who>
     <when>Future</when>
     <thetext>Hello World Again</thetext>
</desc>
</xml>
EOT

data = doc.search('desc').map{ |desc|
  desc.children.reject(&:text?).map { |desc_child|
    [desc_child.name, desc_child.text]
  }.to_h
}

data
# => [{"id"=>"2408",
#      "who"=>"[email protected]",
#      "when"=>"Today",
#      "thetext"=>"Hello World"},
#     {"id"=>"2409",
#      "who"=>"[email protected]",
#      "when"=>"Future",
#      "thetext"=>"Hello World Again"}]

That provides an array of hashes containing the node names and their text.

How to change the name of the nodes to be used as keys and print it out is left for you to figure out.
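
For readers who do need to stay with REXML, as the question asks, here is a minimal sketch of one way to print each element's attributes ahead of its text. It relies on REXML's element.attributes.each, which yields each attribute's name and value. The helper name desc_lines and the labels hash (mapping id to commentid and when to bug_when) are assumptions for illustration, inferred from the output shown in the question. The sample is wrapped in a single <xml> root element, as in the Nokogiri example.

```ruby
require 'rexml/document'

# Sample data from the question, wrapped in one root element.
xml_source = <<~EOT
  <xml>
  <desc>
       <id>2408</id>
       <who name="Joe Silva">[email protected]</who>
       <when>Today</when>
       <thetext>Hello World</thetext>
  </desc>
  <desc>
       <id>2409</id>
       <who name="Joe Silva2">[email protected]</who>
       <when>Future</when>
       <thetext>Hello World Again</thetext>
  </desc>
  </xml>
EOT

# Hypothetical label mapping (an assumption, not part of the XML):
# element names on the left, the names to print on the right.
labels = { 'id' => 'commentid', 'when' => 'bug_when' }

# Hypothetical helper: returns the output lines for every <desc> element.
def desc_lines(xml, labels = {})
  document = REXML::Document.new(xml)
  lines = []
  document.elements.each('//desc') do |desc|
    desc.elements.each do |child|
      # Emit each attribute (name and value) before the element's own text.
      child.attributes.each do |attr_name, attr_value|
        lines << "#{attr_name} : #{attr_value}"
      end
      # Use the mapped label if there is one, otherwise the element name.
      lines << "#{labels.fetch(child.name, child.name)} : #{child.text}"
    end
    lines << '' # blank line between <desc> blocks
  end
  lines
end

puts desc_lines(xml_source, labels)
```

If only one known attribute is needed, child.attributes['name'] returns its value directly, or nil when the element has no such attribute.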