Regular Expression to exclude a String around the required String

120 Views Asked by whitewiz At 07 July 2020 at 14:51

Given a fragment of HTML like this:

...<div class="..."><a class="..." href="...">I need this String only</a></div>...

How do I write a regular expression (for Rainmeter, which uses Perl RegEx) such that:

-the required string "I need this String only" is captured in a group so it can be extracted,

-the HTML link tag <a>...</a> may be absent or present, may appear in the middle of the required string, and may appear multiple times?

My attempt:

(?siU) <div class="...">.*[>]{0,1}(.*)[</a>]{0,1}</div> where:

.*= matches every character except newline {<a class ... "}
[>]{0,1}= accepts 0 or 1 occurrences of > {up to >}
(.*)= captures my String
[</a>]{0,1}= accepts 0 or 1 occurrences of </a>

This, of course, doesn't work as I want: the output includes the HTML link markup preceding my string. So my question is:

How do I write a better (and working) RegEx?


There is 1 best solution below

joanis On 07 July 2020 at 15:17

Even though I agree with the advice to use a real parser for this problem, this regular expression should solve your problem:

<div [^<>]*>(?:[^<>]*<a [^<>]*>)*([^<>]*)(?:</a>)*</div>

Logic:

require <div ...> at the beginning and </div> at the end.
allow and ignore <a ...> before the matched text arbitrarily many times
allow and ignore </a> after the matched text arbitrarily many times
ignore any text before any <a ...> with [^<>]* in front of it. Using .* would also work, but then it would skip all text arbitrarily up to the last instance of <a ...> in your string.
I use [^<>]* instead of .* to match non-tag text in a protected way, since literal < and > are not allowed.
I use (?:...) to group without capturing. If that is not supported in your programming language, just use (...) instead, and adjust which match you use.
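
To make the logic above concrete, here is a minimal sketch that runs the pattern against two inputs, one with the link tag and one without. It uses Python's re module purely for illustration, on the assumption that the constructs involved (negated character classes, non-capturing groups) behave the same way in the Perl-style engine Rainmeter uses. The sample HTML, class names, URL, and the extract helper are all made up for this example.

```python
import re

# Pattern from the answer; the opening <div ...> tag is matched with [^<>]*.
PATTERN = re.compile(
    r'<div [^<>]*>(?:[^<>]*<a [^<>]*>)*([^<>]*)(?:</a>)*</div>'
)

def extract(html):
    """Return the text captured by group 1, or None if nothing matches."""
    match = PATTERN.search(html)
    return match.group(1) if match else None

# Hypothetical inputs: the same text with and without a surrounding link.
samples = [
    '<div class="title"><a class="link" href="https://example.com">'
    'I need this String only</a></div>',
    '<div class="title">I need this String only</div>',
]

for html in samples:
    print(extract(html))  # prints "I need this String only" both times
```

Because the text is matched with [^<>]* rather than .*, the capture cannot run across a tag boundary, which is what keeps the <a ...> markup out of the group.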

Caveat: this won't be fully general but should work for your problem as described.
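
As one illustration of that caveat, the sketch below (again in Python, with a made-up input and a hypothetical extract helper) shows a case the pattern does not cover: a link that wraps only part of the wanted text. The pattern here is the one from this answer, with the opening tag matched by [^<>]*.

```python
import re

# Same pattern as in the answer; the opening <div ...> tag uses [^<>]*.
PATTERN = re.compile(
    r'<div [^<>]*>(?:[^<>]*<a [^<>]*>)*([^<>]*)(?:</a>)*</div>'
)

def extract(html):
    """Return the text captured by group 1, or None if nothing matches."""
    match = PATTERN.search(html)
    return match.group(1) if match else None

# Hypothetical input: the link sits in the middle of the wanted text.
partial_link = '<div class="title">I need <a href="#">this</a> String only</div>'

# After </a> the pattern requires </div> immediately, but " String only"
# comes first, so the whole match fails.
print(extract(partial_link))  # prints None
```

Handling this case with a single capture group is not possible, since the wanted text is split across several pieces; a real parser, or matching the whole div body and stripping tags afterwards, would likely be needed.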
