I have this string:
s='''
D. JUAN:
¡Cálmate, pues, vida mía!
Reposa aquí; y un momento
olvida de tu convento
la triste cárcel sombría.
¡Ah! ¿No es cierto,
ángel de amor,
que en esta apartada orilla
más pura la luna brilla
y se respira mejor?
'''
If I want all the words starting with a vowel:
import re
print(re.findall(r'\b[aeiouAEIOU]\w*\b', s))
and the output is:
['aquí', 'un', 'olvida', 'Ah', 'es', 'amor', 'en', 'esta', 'apartada', 'orilla']
Now, I try to list all words that do not start with a vowel:
print(re.findall(r'\b[^aeiouAEIOU]\w*\b', s))
and my output is:
['D', 'JUAN', 'Cálmate', 'pues', 'vida', ' mía', 'Reposa', ' aquí', 'y', ' un', ' momento', '\nolvida', ' de', ' tu', ' convento', '\nla', ' triste', ' cárcel', ' sombría', 'No', ' es', ' cierto', 'ángel', ' de', ' amor', 'que', ' en', ' esta', ' apartada', ' orilla', '\nmás', ' pura', ' la', ' luna', ' brilla', '\ny', ' se', ' respira', ' mejor']
The [^aeiouAEIOU] negated character class matches any character other than a, e, i, o, u, A, E, I, O and U, so a linefeed char or a § will also be matched if it is preceded by a word character (a letter, digit or underscore in most cases), since the negated character class is itself preceded by a \b construct.
So, you need to use
r'\b(?![aeiouAEIOU])\w+'
where the (?![aeiouAEIOU]) negative lookahead makes sure that \w+ only matches one or more word chars whose first char is not one of the letters inside the character class.
See the regex demo (note that you must select the right engine in the regex101 options).
Note that you do not need any \b at the end after \w+, since the word boundary is implied at that position.
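To make the difference concrete, here is a minimal sketch that runs both patterns on two lines of the poem. The names sample and non_vowel_words are only illustrative and do not come from the question; the lookahead pattern is the one described above.

```python
import re

# Two lines of the poem, enough to show both behaviours (illustrative sample).
sample = "Reposa aquí; y un momento\nolvida de tu convento"

# Negated class: [^aeiouAEIOU] consumes one character, which may be a space
# or a newline as long as a word character comes right before it.
negated = re.findall(r'\b[^aeiouAEIOU]\w*\b', sample)


def non_vowel_words(text):
    """Return the words whose first character is not an ASCII vowel.

    Hypothetical helper: the lookahead only inspects the next character and
    consumes nothing, so \\w+ is the only part that contributes to the match.
    """
    return re.findall(r'\b(?![aeiouAEIOU])\w+', text)


print(negated)
# ['Reposa', ' aquí', 'y', ' un', ' momento', '\nolvida', ' de', ' tu', ' convento']
print(non_vowel_words(sample))
# ['Reposa', 'y', 'momento', 'de', 'tu', 'convento']
```

Note that accented vowels such as á are not in the character class, so with this pattern a word like ángel still counts as not starting with a vowel, just as it does in the output shown in the question.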