string1 = 'Department of the Federal Treasury "IFTS No. 43"'
string2 = 'Federal Treasury Company "Light-8"'
I need to get the first capital letters of words longer than 3 characters that are before the opening quote, and also extract the quoted expression using a common pattern for 2 strings.
Final string should be:
- for string1: 'IFTS No. 43, DFT'.
- for string2: 'Light-8, FTC'.
I would like a single pattern that works for both strings, so that I can reuse the expression on a DataFrame.
You can use a capturing group and alternation.
See this demo at regex101 (FYI read: The Trick)
It either matches the quoted part and captures the characters inside it (anything other than double quotes) into the first capturing group, OR matches each capital letter at an initial \b word boundary (start of a word).
Python demo at tio.run