Remove parent element in XML if a child contains a particular string


I have quite a few XML documents from which I want to delete particular child elements.

I've found some regular expressions close to what I need, but they never quite worked for my specific case: they deleted more than they should.

Could someone help me out with this? I'm using Notepad++.

The goal is to delete every <Item type="CEntityDef"> element that contains the child <parentIndex value="-1" />:

<?xml version="1.0" encoding="UTF-8"?>
<CMapData>
 <entities>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="255" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="2334" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something_2</archetypeName>
   <parentIndex value="-1" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something_2</archetypeName>
   <parentIndex value="-1" />
  </Item>
 </entities>
</CMapData>

Desired outcome

<?xml version="1.0" encoding="UTF-8"?>
<CMapData>
 <entities>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="255" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="2334" />
  </Item>
 </entities>
</CMapData>

Thank you for reading!

Answer by Michael Kay

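This is a very straightforward job for XSLT. In XSLT 3.0, make shallow-copy the default action and add an empty template rule that matches the items to drop: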

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
 <xsl:mode on-no-match="shallow-copy"/>
 <xsl:template match="Item[@type='CEntityDef']
                          [parentIndex/@value='-1']"/>
</xsl:transform>

Don't try to do this kind of thing with regular expressions.
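
To see why a regex tends to delete too much here, consider this small Python sketch. The pattern is a hypothetical one of the kind people usually try (the asker's actual expression isn't shown), and the sample is the XML from the question. The lazy `.*?` still starts at the first `<Item ...>` it finds and keeps going until it reaches a `parentIndex` of -1, so it swallows the good items that come before it.

```python
import re

# Sample data copied from the question (XML declaration omitted for brevity).
SAMPLE = """<CMapData>
 <entities>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="255" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="2334" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something_2</archetypeName>
   <parentIndex value="-1" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something_2</archetypeName>
   <parentIndex value="-1" />
  </Item>
 </entities>
</CMapData>
"""

# Hypothetical naive pattern: "an Item, anything, a parentIndex of -1, end of Item".
# re.DOTALL plays the role of Notepad++'s ". matches newline" option.
NAIVE = re.compile(
    r'<Item type="CEntityDef">.*?<parentIndex value="-1" />\s*</Item>',
    re.DOTALL,
)

result = NAIVE.sub("", SAMPLE)

# The first match begins at the FIRST Item and runs through the third one,
# so the two items that should have been kept are deleted as well.
remaining_items = result.count("<Item ")
print(remaining_items)  # prints 0, although 2 items should survive
```

A regex has no notion of element boundaries, so patching the pattern usually just moves the problem. An XML-aware tool matches whole elements instead.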

If it's a one-off requirement, you could use an interactive tool like xmlstarlet or Saxon's Gizmo. In Gizmo it's simply:

delete //Item[@type='CEntityDef'][parentIndex/@value='-1']
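
If neither an XSLT processor nor Gizmo is at hand, the same deletion can be sketched with Python's standard library. This is only an illustration under some assumptions: the function names `remove_matching_items` and `process_file` are made up for this example, and ElementTree's limited XPath support can't express the `parentIndex/@value` predicate, so the check is written out by hand.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question (XML declaration omitted for brevity).
SAMPLE = """<CMapData>
 <entities>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="255" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something</archetypeName>
   <parentIndex value="2334" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something_2</archetypeName>
   <parentIndex value="-1" />
  </Item>
  <Item type="CEntityDef">
   <archetypeName>something_2</archetypeName>
   <parentIndex value="-1" />
  </Item>
 </entities>
</CMapData>
"""


def remove_matching_items(root):
    """Remove each <Item type="CEntityDef"> that has a child
    <parentIndex value="-1"/>. Returns the number of removed elements."""
    removed = 0
    # ElementTree elements have no parent pointer, so visit every element
    # as a potential parent. list() takes a snapshot before we mutate.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != "Item" or child.get("type") != "CEntityDef":
                continue
            if any(p.get("value") == "-1" for p in child.findall("parentIndex")):
                parent.remove(child)
                removed += 1
    return removed


def process_file(path):
    """Hypothetical helper: rewrite one file in place (keep a backup)."""
    tree = ET.parse(path)
    removed = remove_matching_items(tree.getroot())
    if removed:
        tree.write(path, encoding="UTF-8", xml_declaration=True)
    return removed


root = ET.fromstring(SAMPLE)
removed_count = remove_matching_items(root)
kept_values = [p.get("value") for p in root.iter("parentIndex")]
print(removed_count)  # prints 2
print(kept_values)    # prints ['255', '2334']
```

For a folder of documents you could call `process_file` on each path, for example via `pathlib.Path(folder).glob("*.xml")`. Note that ElementTree drops comments when parsing with its default settings and may shift indentation slightly when writing, so test on copies first.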