REGEX - replace text between end tag & start tag


I want to remove the text between the end of one HTML tag and the beginning of another.

The text between the tags differs from block to block, and there are of course multiple such blocks to delete on the page.

</h1>
Section: ab (1)<br>Updated: 2015-05-01<br><a href="file:///home/gareththomasnz/Desktop/VirtualBoxShare/merged.html#2_index">Index</a>
<a href="file:///man/man2html">Return to Main Contents</a><hr>

<p>
<a name="2_lbAB">&nbsp;</a>
</p><h2>

Everything between the </h1> and <h2> tags, throughout the whole page, must be deleted.

I have tried a few things but can't get it to work. Any suggestions?


Gareth Thomas On

http://sundstedt.se/blog/delete-specific-text-blocks-between-two-characters/

This is a solution.

It deletes an arbitrary block of text between any two characters without using regex.
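
The linked post is not reproduced here, so the following is only a minimal sketch of the general non-regex idea, assuming plain string searching in Python. The function name `delete_between` and its parameters are hypothetical and are not taken from the linked page.

```python
def delete_between(text: str, start: str, end: str) -> str:
    """Remove every span lying between `start` and the next `end`.

    Both markers are kept; only the text between them is dropped.
    (Hypothetical helper for illustration, not the linked post's code.)
    """
    parts = []
    pos = 0
    while True:
        s = text.find(start, pos)
        if s == -1:
            break  # no further start marker
        s_end = s + len(start)
        e = text.find(end, s_end)
        if e == -1:
            break  # start marker without a following end marker: leave as-is
        parts.append(text[pos:s_end])  # keep everything up to and including `start`
        pos = e                        # skip the block, resume at `end`
    parts.append(text[pos:])
    return "".join(parts)


page = "<h1>ab</h1>\nSection: ab (1)<br>\n</p><h2>NAME</h2>"
stripped = delete_between(page, "</h1>", "<h2>")
print(stripped)
```

Because this matches the markers literally, it would miss variants such as `</H1>` or `<h2 class="...">`; those would need extra handling.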

Bohemian On

Turn on DOTALL and use a reluctant quantifier:

Search: (?s)(?<=</h1>).*?(?=<h2>)
Replace: <blank>

Note: I'm not familiar with PowerGREP, so it may use "slash delimited" regex syntax, in which case the slash inside </h1> has to be escaped:

/(?<=<\/h1>).*?(?=<h2>)/s
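
For anyone who wants to try the pattern outside a grep tool, here is a small sketch of the same search and replace using Python's `re` module. The sample string is a shortened stand-in for the page in the question, and adding `re.IGNORECASE` is an assumption on my part, in case the real page mixes `</H1>` and `</h1>`.

```python
import re

# DOTALL lets "." match newlines; "*?" is the reluctant (lazy) quantifier,
# so each match stops at the first <h2> that follows a </h1>.
# IGNORECASE is an assumption, added in case the tags appear in upper case.
BETWEEN_H1_H2 = re.compile(r"(?<=</h1>).*?(?=<h2>)", re.DOTALL | re.IGNORECASE)


def strip_between_headings(html: str) -> str:
    """Delete everything between each </h1> and the next <h2>, keeping both tags."""
    return BETWEEN_H1_H2.sub("", html)


sample = (
    "<h1>ab</h1>\n"
    "Section: ab (1)<br>Updated: 2015-05-01<br>\n"
    "<p>\n<a name=\"2_lbAB\">&nbsp;</a>\n</p><h2>NAME</h2>\n"
    "<h1>curl</h1>\nSection: curl (1)<hr>\n<h2>NAME</h2>\n"
)
cleaned = strip_between_headings(sample)
print(cleaned)
```

The lookbehind and lookahead only assert that the tags are there without consuming them, which is why `</h1>` and `<h2>` survive the replacement. A lookahead of `<h2>` will not match an opening tag that carries attributes, such as `<h2 class="x">`.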