Find overlapping keywords in a sentence using re/regex

import re
keyword_pattern = 'love|love India'
sentence = 'I love India'
matches = re.finditer(keyword_pattern, sentence)

Example: keywords = ['love', 'love India', 'pakistan'], sentence = 'I love India'

Output I need: ['love', 'love India']

Output I am getting: ['love']

Max On

The docs state that

re.finditer(pattern, string, flags=0)

Return an iterator yielding match objects over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result.

For convenience, we can also use re.findall, which (for a pattern without capturing groups) does the same as [el.group() for el in re.finditer(...)], but has the same limitation:

re.findall(pattern, string, flags=0)

Return all non-overlapping matches of pattern in string, as a list of strings or tuples. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result.
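To see what "non-overlapping" means for the pattern in the question, here is a minimal sketch (the variable names short_first and long_first are made up for illustration). The order of the alternation decides which keyword is found, but either way only one match is reported for the overlapping region:

```python
import re

sentence = "I love India"

# Alternation tries its branches left to right at each position, so "love"
# matches at index 2 and the scan resumes after it. The text "love India"
# starts at the same index and is never reported.
short_first = re.findall(r"love|love India", sentence)

# Putting the longer branch first makes that one match instead, but the
# shorter keyword inside it is then consumed and skipped.
long_first = re.findall(r"love India|love", sentence)

print(short_first)  # ['love']
print(long_first)   # ['love India']
```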

But since you already have cleanly separated patterns, just loop over them and search for each one on its own:

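import re

patterns = ["love", "love India"]
sentence = "I love India"

# Search for each keyword separately so that a match for one keyword
# cannot hide an overlapping match for another.
matches = []
for pat in patterns:
    matches += re.findall(pat, sentence)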

This yields the following for matches:

['love', 'love India']
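If the keywords are meant as literal text rather than regular expressions, a slightly more defensive variant of the loop might look like the sketch below. The helper name find_keywords is hypothetical, and the sketch assumes every keyword starts and ends with a word character (otherwise the \b anchors would not behave as intended). Unlike the loop above, it reports each keyword at most once instead of once per occurrence:

```python
import re

def find_keywords(keywords, sentence):
    """Return every keyword that occurs in sentence, in keyword order."""
    found = []
    for kw in keywords:
        # re.escape makes the keyword match as literal text; the \b anchors
        # keep "love" from matching inside a longer word such as "gloves".
        if re.search(r"\b" + re.escape(kw) + r"\b", sentence):
            found.append(kw)
    return found

result = find_keywords(["love", "love India", "pakistan"], "I love India")
print(result)  # ['love', 'love India']
```

Keywords that do not occur in the sentence, such as 'pakistan' here, are simply left out of the result.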