How to match complete words for acronym using regex?


I want to extract only the full phrase that a parenthesized acronym stands for.

For example, given the sentence 'Lung cancer screening (LCS) reduces NSCLC mortality', I want to get 'Lung cancer screening' as the result.

How can I do it with regex?


Original question: I want to remove runs of uppercase letters: "HIV acquired immunodeficiency syndrome are at a particularly high risk of cervical cancer" => " acquired immunodeficiency syndrome are at a particularly high risk of cervical cancer"


There are 2 best solutions below

2
Tim Biegeleisen On

Assuming you want to target 2 or more capital letters, I would use re.sub here:

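import re

inp = "Lung cancer screening (LCS) reduces NSCLC mortality"
# Replace parenthesized acronyms and bare runs of 2+ capitals (plus surrounding whitespace) with one space
output = re.sub(r'\s*(?:\([A-Z]+\)|[A-Z]{2,})\s*', ' ', inp).strip()
print(output)  # Lung cancer screening reduces mortality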
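If the goal is instead to pull out the phrase that the parenthesized acronym stands for, one possible sketch is shown below. It assumes the expansion has exactly one word per acronym letter and that each word starts with the matching letter; the helper name `expansions` is made up for illustration and is not part of the `re` module.

```python
import re

def expansions(text):
    """Return (long_form, acronym) pairs for each parenthesized acronym in text."""
    results = []
    for m in re.finditer(r'\(([A-Z]{2,})\)', text):
        acronym = m.group(1)
        # Take as many words before the "(" as the acronym has letters
        words = text[:m.start()].split()
        candidate = words[-len(acronym):]
        # Assumption: word i must start with acronym letter i (case-insensitive)
        if len(candidate) == len(acronym) and all(
            w[0].upper() == c for w, c in zip(candidate, acronym)
        ):
            results.append((' '.join(candidate), acronym))
    return results

print(expansions('Lung cancer screening (LCS) reduces NSCLC mortality'))
# [('Lung cancer screening', 'LCS')]
```

Expansions that contain stop words or hyphenated terms (for example "non-small cell lung cancer (NSCLC)") do not satisfy the one-word-per-letter assumption and would need a looser matching rule.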
0
Mouayad_Al On
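import re

s = 'HIV acquired immunodeficiency syndrome are at a particularly high risk of cervical cancer'
# Removes every uppercase letter, so it would also strip the capital from ordinary words such as "Lung"
print(re.sub(r'[A-Z]+', '', s).strip())  # acquired immunodeficiency syndrome are at a particularly high risk of cervical cancer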

Based on @jensgram's answer.
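Removing every capital letter also alters ordinary capitalized words, so a narrower variant may be safer. The sketch below removes only whole words made of two or more capitals, optionally wrapped in parentheses, and then collapses the leftover whitespace. The function name `strip_acronyms` is hypothetical.

```python
import re

def strip_acronyms(text):
    # \b...\b restricts the match to whole words of 2+ capitals;
    # \(? and \)? also consume surrounding parentheses when present
    without = re.sub(r'\(?\b[A-Z]{2,}\b\)?', '', text)
    # Collapse the whitespace left behind by the removed tokens
    return re.sub(r'\s+', ' ', without).strip()

print(strip_acronyms('HIV acquired immunodeficiency syndrome'))
# acquired immunodeficiency syndrome
print(strip_acronyms('Lung cancer screening (LCS) reduces NSCLC mortality'))
# Lung cancer screening reduces mortality
```

Because of the word boundaries, mixed-case tokens such as "DNase" and sentence-initial words such as "Lung" are left untouched.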