I have come across some behaviour of the search function in regex that made me think there is an implicit \b anchor in the pattern. Is this the case?
<code>
import re

text = "bowl"
print(re.search(r"b|bowl", text))    # first alternative in this pattern works
print(re.search(r"o|bowl", text))    # but the first alternative won't work here
print(re.search(r"w|bowl", text))    # nor here
print(re.search(r"l|bowl", text))    # nor here
print(re.search(r"bo|bowl", text))   # first alternative in this pattern works
print(re.search(r"bow|bowl", text))  # first alternative in this pattern works
</code>
OUTPUT
<re.Match object; span=(0, 1), match='b'>
<re.Match object; span=(0, 4), match='bowl'>
<re.Match object; span=(0, 4), match='bowl'>
<re.Match object; span=(0, 4), match='bowl'>
<re.Match object; span=(0, 2), match='bo'>
<re.Match object; span=(0, 3), match='bow'>
I have tried to find out whether this is the case, but I couldn't find any explanation.
I'm not a regex expert, so I'll use simple words to describe what happens internally.
<code>search</code> works from left to right, and so do the <code>|</code> alternatives. Also, <code>search</code> is different from <code>match</code>: it moves forward to try to find the pattern across the string, not just at the start.
Take this:
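A minimal sketch of that difference, reusing the question's own string and pattern (the variable names are only for illustration, and the <code>match</code> call is added here purely for contrast):

```python
import re

text = "bowl"

# match only tries the pattern at position 0 of the string
at_start = re.match(r"o", text)        # no match: the string starts with "b"

# search retries the pattern at each position, moving left to right
anywhere = re.search(r"o", text)       # finds "o" at span (1, 2)

# with an alternation, both alternatives are tried at position 0
# before the engine moves on to position 1
both = re.search(r"o|bowl", text)      # finds "bowl" at span (0, 4)

print(at_start)
print(anywhere)
print(both)
```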
So when the <code>o</code> alternative is tested, the matcher is on the <code>b</code> character of the input string, so it doesn't match, and the engine tries the second alternative at that same position. If that failed as well, the engine would skip to the next character (since all match possibilities at that position are exhausted) and would match <code>o</code>. But since the second alternative does match, that doesn't happen: the <code>bowl</code> characters are consumed.
If you try:
then <code>o</code> will be matched.
Note that this is not specific to Python; it's how regex engines work in general.
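The walk-through above can be modelled roughly as two nested loops: positions on the outside, alternatives on the inside. This is only a sketch of the idea, not how the <code>re</code> module is actually implemented. <code>first_match</code> is a hypothetical helper, and the input <code>"bow"</code> is an assumed example of a string where the second alternative fails.

```python
import re

def first_match(alternatives, text):
    """Rough model of search over an alternation: the leftmost position wins,
    then the first alternative (in written order) that matches there."""
    for pos in range(len(text) + 1):
        for alt in alternatives:
            # Pattern.match(text, pos) tries the pattern only at index pos
            m = re.compile(alt).match(text, pos)
            if m:
                return pos, m.group()
    return None

# "bowl" matches at position 0, so "o" at position 1 is never reached
print(first_match(["o", "bowl"], "bowl"))   # (0, 'bowl')

# here "bowl" fails at position 0, so the engine advances and finds "o"
print(first_match(["o", "bowl"], "bow"))    # (1, 'o')

# "b" matches at position 0 and is listed first, so it wins over "bowl"
print(first_match(["b", "bowl"], "bowl"))   # (0, 'b')
```

So there is no implicit \b involved: the first alternative only seems to "work" when it can match at the same starting position as the second one.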
If you want the alternate behaviour you could write:
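One possible way, assuming "alternate behaviour" means preferring the first alternative wherever it occurs in the string, is to search for each alternative separately, in the order they are written. <code>search_in_order</code> is a hypothetical helper name used only for this sketch:

```python
import re

def search_in_order(alternatives, text):
    """Return the match of the first alternative found anywhere in text,
    trying the alternatives in the order given."""
    for alt in alternatives:
        m = re.search(alt, text)
        if m:
            return m
    return None

m = search_in_order(["o", "bowl"], "bowl")
print(m)  # the "o" at span (1, 2), since "o" is tried across the whole string first
```

The trade-off is that each alternative scans the string on its own, so the match returned is no longer necessarily the leftmost one.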