Regex Puzzle: Match a pattern only if it is between two $$ without indefinite look behind

I am writing a snippet for the Vim plugin UltiSnips that triggers on a regex pattern (as supported by Python 3). To avoid conflicts, I want the snippet to trigger only when the match lies somewhere inside $$___$$. Note that the trigger pattern may have an arbitrary string in front of or behind it. For example, I want to match every "a" in "$$ccbbabbcc$$" but not in "ccbbabbcc". This would be trivial with indefinite (variable-length) lookbehind, but this isn't .NET, and Python's standard re module does not allow it. Is there a standard way of implementing this kind of expression? Note that I cannot use any Python functions; the expression must be a self-contained trigger.
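For readers who want to see the restriction directly, here is a minimal sketch of what happens when a variable-length lookbehind is handed to Python's re module. The pattern is a hypothetical one written for illustration, not the actual snippet trigger.

```python
import re

# Hypothetical pattern: "a" preceded by "$$" and any number of characters.
pattern = r"(?<=\$\$.*)a"

try:
    re.compile(pattern)
    error_message = None
except re.error as exc:
    # The standard re module rejects lookbehinds whose width is not fixed.
    error_message = str(exc)

# Prints the compile error (None would mean the pattern was accepted).
print(error_message)
```

The error is raised at compile time, before any text is searched, so the pattern cannot be used as a trigger at all.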

1
josephdiniso On

The following should work:

re.findall(r"\${2}.+\${2}", stuff)

Breakdown:

Looks for two '$'

\${2}

Then looks for one or more of any character

.+

Then looks for two '$' again

\${2}
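Because .+ is greedy, it may help to check what the pattern returns when a line holds more than one $$...$$ block. The sketch below uses made-up sample strings and compares the pattern above with a lazy variant (.+?), which is an assumption added here and not part of the answer.

```python
import re

# Sample strings are made up for illustration.
single = "$$ccbbabbcc$$ccbbabbcc"
double = "$$a$$ x $$b$$"

greedy = r"\${2}.+\${2}"   # pattern from the answer above
lazy = r"\${2}.+?\${2}"    # assumed variant: stop at the nearest closing $$

greedy_single = re.findall(greedy, single)
greedy_double = re.findall(greedy, double)
lazy_double = re.findall(lazy, double)

print(greedy_single)  # ['$$ccbbabbcc$$']
print(greedy_double)  # ['$$a$$ x $$b$$']
print(lazy_double)    # ['$$a$$', '$$b$$']
```

In both forms the match is the whole delimited block, not the individual "a" inside it.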

1
Haleemur Ali On

I believe this regex would work to match the a within the $$:

import re

text = '$$ccbbabbcc$$ccbbabbcc'
re.findall(r'\${2}.*(a).*\${2}', text)
# prints
['a']

Alternatively:

A simple approach (requiring two checks instead of one regex) would be to first find all parts enclosed in your delimiters, then check whether your search string is present within them.

Example:

text = '$$ccbbabbcc$$ccbbabbcc'
search_string = 'a'
parts = re.findall(r'\${2}.+\${2}', text)
[p for p in parts if search_string in p]
# prints
['$$ccbbabbcc$$']
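If the position of each occurrence is needed, not only the enclosing block, the two-step idea can be extended roughly as follows. This is a sketch under assumptions: it uses a lazy .+? so that a block ends at the nearest closing $$, and the names hits and inner are introduced here for illustration.

```python
import re

text = "$$ccbbabbcc$$ccbbabbcc"
search_string = "a"

hits = []
# Step 1: find each $$...$$ block (lazy, so a block ends at the nearest $$).
for block in re.finditer(r"\${2}(.+?)\${2}", text):
    inner = block.group(1)
    # Step 2: look for the search string inside that block only.
    for m in re.finditer(re.escape(search_string), inner):
        # Convert the offset within the block to an index into text.
        hits.append(block.start(1) + m.start())

print(hits)  # [6]
```

The second "a" (index 17) is not reported because it is not followed by a closing $$.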
3
Booboo On

If what you are looking for only occurs once between the '$$', then:

\$\$.*?(a)(?=.*?\$\$)

The pattern breaks down as follows:

  1. \$\$ Matches '$$'
  2. .*? Matches 0 or more characters non-greedily
  3. (a) Matches 'a' and captures it in group 1
  4. (?=.*?\$\$) String must be followed by 0 or more arbitrary characters followed by '$$'

In the example below it captures 3 a characters.

The code:

import re

s = "$$ccbbabbcc$$xxax$$bcaxay$$"

print(re.findall(r'\$\$.*?(a)(?=.*?\$\$)', s))

Prints:

['a', 'a', 'a']
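To see which occurrences are captured, the same pattern can be run with re.finditer and the indices compared against every "a" in the string. This is only a quick check on the sample above; the names captured and all_as are introduced here for illustration.

```python
import re

s = "$$ccbbabbcc$$xxax$$bcaxay$$"
pattern = r"\$\$.*?(a)(?=.*?\$\$)"

# Index of each captured "a" in s.
captured = [m.start(1) for m in re.finditer(pattern, s)]
# Index of every "a" in s, for comparison.
all_as = [i for i, ch in enumerate(s) if ch == "a"]

print(captured)  # [6, 15, 21]
print(all_as)    # [6, 15, 21, 23]
```

If the '$$' are read as pairs (an assumption about the intended delimiting), index 15 lies between two blocks and index 23 is a second occurrence inside one block. The latter is consistent with the caveat that the pattern suits a single occurrence per '$$'.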