Regular expression to match up to the first space of a line not preceded by a comma

51 Views Asked by M1KEMEX At 24 March 2024 at 06:48

I'm editing a large dictionary file in which the term and definition pairs do not have a consistent format. Some entries are "simple" words; others combine the base term with a suffix that alters it (its gender, for example), effectively stacking two terms into one entry:

abacora (definition)
abacorar  (definition)
abad, desa (definition)

This last term means "abad" and "abadesa" (feminine variant).

I've been trying to write the regular expression to capture this "peculiarity" but I can't seem to make it work. This matches the first part of the term fine, but fails to capture the second part:

^[^\s(?<!,)]+

It should return:

"abacora"
"abacorar"
"abad, desa"


There are 2 best solutions below

Tim Biegeleisen On 24 March 2024 at 06:55

I would use the following pattern, which should capture all leading words possibly including a CSV list:

^\w+(?:,\s*\w+)*

This pattern says to match:

^ from the start of the line
\w+ match a word
(?:,\s*\w+)* optionally followed by a CSV list of other words
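As a quick check, here is a minimal Python sketch that runs this pattern over the three sample lines from the question. It assumes Python's built-in re module, which is not necessarily the regex engine the asker is using, and the variable names are only illustrative:

```python
import re

# Pattern from the answer above: one word, then zero or more ", word" groups.
pattern = re.compile(r"^\w+(?:,\s*\w+)*")

# The three sample entries from the question.
lines = [
    "abacora (definition)",
    "abacorar  (definition)",
    "abad, desa (definition)",
]

# match() anchors at the start of each string, so group(0) is the leading term.
terms = [pattern.match(line).group(0) for line in lines]
print(terms)  # ['abacora', 'abacorar', 'abad, desa']
```

The match stops at the space before the opening parenthesis because that space is not preceded by a comma.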


Edit:

More generally, we can use [^,\s]+ to match a run of characters that are neither whitespace nor commas, which gives this pattern:

^[^,\s]+(?:,\s*[^,\s]+)*
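To see where the more general pattern makes a difference, the following Python sketch compares both patterns on a term containing a hyphen. The entry "ex-abad, desa" is a made-up example and not from the asker's file; Python's re module is again an assumption:

```python
import re

word_pattern = re.compile(r"^\w+(?:,\s*\w+)*")              # first pattern
general_pattern = re.compile(r"^[^,\s]+(?:,\s*[^,\s]+)*")   # pattern from the edit

# Hypothetical entry whose term contains a non-word character (a hyphen).
line = "ex-abad, desa (definition)"

word_term = word_pattern.match(line).group(0)
general_term = general_pattern.match(line).group(0)

print(word_term)     # ex
print(general_term)  # ex-abad, desa
```

Since \w does not match a hyphen, the first pattern stops after "ex", while the character-class version keeps going until the first space that is not preceded by a comma.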


Nick On 24 March 2024 at 06:55

Your regex is just a negated character class, which matches any character other than whitespace or one of the literal characters ( ? < ! , ). Inside the brackets, (?<!,) is not treated as a lookbehind. What you need is to match up to a space which is not preceded by a comma, which you could do with this regex:

^(?:, |[^ ])+

This matches:

(?:, |[^ ])+ : one or more of either:
- , : a comma followed by a space; or
- [^ ] : a character which is not a space
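The difference between the two regexes can be sketched in Python as follows. This assumes Python's re module rather than the asker's actual tool, and the variable names are illustrative only:

```python
import re

# The asker's attempt: a single negated character class, so "(?<!,)" contributes
# only the literal characters ( ? < ! , ) to the excluded set.
original = re.compile(r"^[^\s(?<!,)]+")

# The regex from this answer: repeat either "comma + space" or any non-space character.
fixed = re.compile(r"^(?:, |[^ ])+")

lines = [
    "abacora (definition)",
    "abacorar  (definition)",
    "abad, desa (definition)",
]

original_terms = [original.match(line).group(0) for line in lines]
fixed_terms = [fixed.match(line).group(0) for line in lines]

print(original_terms)  # ['abacora', 'abacorar', 'abad']
print(fixed_terms)     # ['abacora', 'abacorar', 'abad, desa']
```

The original stops at the comma because the comma is one of the excluded characters. Note that the fixed regex only treats a literal space as the separator, so a tab after the term would not end the match.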

