xml2 - Adding parent node attributes to data frame

I am dealing with an XML file like this one, with many nested nodes. The same attribute name may be used at different node levels.

<reporte>
  <reporte_cabezal fecha_generado="08/05/2023 19:02" cant_compras="1539"/>
  <reporte_dato>
    <compra id_compra="1022855" id_ucc="65" num_compra="3">
      <items>
        <item nro_item="1" cant_pedida="1200.00" id_articulo="26058">
          <atributos_item>
            <atributo_item id_prop_atributo="4" desc_prop_atributo="TIPO">
              <atributo_valores>
                <atributo_valor valor_texto="DOBLE ENVOLTORIO"/>
              </atributo_valores>
            </atributo_item>
            <atributo_item id_prop_atributo="5" desc_prop_atributo="MARCA"/>
          </atributos_item>
        </item>
        <item nro_item="2" cant_pedida="1300.00" id_articulo="26048"/>
      </items>
      <aclaraciones_lla>
        <aclaracion texto="PARA"/>
        <aclaracion texto="Acta de Apertura" fecha="21/04/2023 12:31"/>
      </aclaraciones_lla>
    </compra>
 <compra...
 ...
</reporte_dato>
</reporte>

I am trying to get one data frame per node type, each with a key (or keys) that lets me match its rows to the rows of the other data frames.

I wrote this function, which works and returns the attributes of each node type as a data frame:

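library(xml2)     # read_xml(), xml_find_all(), xml_name(), xml_attrs()
library(purrr)    # map(), set_names(), keep()
library(dplyr)    # bind_rows(), %>%
library(janitor)  # clean_names(), make_clean_names()

xml_to_tibbles <- function(xml) {
  xml_object <- read_xml(xml)

  # Unique tag names that occur anywhere in the document
  nodes <- xml_object %>%
    xml_find_all("//*") %>%
    xml_name() %>%
    unique()

  # One tibble per tag name, built from the attributes of every matching node
  nodes %>%
    map(function(node) {
      xml_object %>%
        xml_find_all(paste0("//", node)) %>%
        map(xml_attrs) %>%
        bind_rows() %>%
        clean_names()
    }) %>%
    set_names(make_clean_names(nodes)) %>%
    keep(~ ncol(.x) > 0)  # drop tags whose nodes carry no attributes
}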

However, I don't have any way to match the rows of each df to the rows of its parent/child node dfs, because the dfs have no shared keys.

The first attribute of each node works as a key. I am looking for a way to add to the dfs the first attribute of all the ancestor nodes. Or perhaps there is a much better way of achieving what I am looking for.
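To make the idea concrete, here is a rough sketch of what "add the first attribute of every ancestor" could look like, using the XPath `ancestor::*` axis. The helper names `node_row()` and `xml_to_keyed_tibbles()` are made up for illustration, and the `<tag>_<attribute>` naming of the key columns is just an assumption to avoid clashes when the same attribute name appears at several levels. The inline document is a trimmed, well-formed version of the sample above.

```r
library(xml2)
library(dplyr)

# Hypothetical helper: the node's own attributes, preceded by the first
# attribute of every ancestor, named "<ancestor tag>_<attribute name>".
node_row <- function(node) {
  ancestors <- xml_find_all(node, "ancestor::*")
  keys <- lapply(ancestors, function(a) {
    attrs <- xml_attrs(a)
    if (length(attrs) == 0) return(NULL)  # e.g. <items> has nothing usable as a key
    setNames(attrs[1], paste0(xml_name(a), "_", names(attrs)[1]))
  })
  c(unlist(keys), xml_attrs(node))
}

# Hypothetical variant of xml_to_tibbles(): one tibble per tag name,
# each row carrying the ancestor keys as extra columns.
xml_to_keyed_tibbles <- function(xml_object) {
  nodes <- xml_find_all(xml_object, "//*[@*]")  # only elements that have attributes
  rows <- lapply(nodes, node_row)
  lapply(split(rows, xml_name(nodes)), bind_rows)
}

doc <- read_xml('
<reporte>
  <reporte_dato>
    <compra id_compra="1022855" id_ucc="65">
      <items>
        <item nro_item="1" id_articulo="26058">
          <atributos_item>
            <atributo_item id_prop_atributo="4" desc_prop_atributo="TIPO"/>
          </atributos_item>
        </item>
        <item nro_item="2" id_articulo="26048"/>
      </items>
    </compra>
  </reporte_dato>
</reporte>')

tables <- xml_to_keyed_tibbles(doc)
names(tables$atributo_item)
```

With something like this, `tables$item` would carry a `compra_id_compra` column and `tables$atributo_item` would carry both `compra_id_compra` and `item_nro_item`, so the tables could be joined on those columns. It visits every node individually, so it may be slow on large files.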
