How to Capture all groups with a dot followed by a space NOT preceeded by two given strings in Google sheets Regex?

76 Views Asked by At

I could not find a duplicate question.

I need to capture groups of period followed by a space. But only those groups than are not preceeded by those two strings: Zn and Zns.

I tested this:

[^Z][^n](\.\s)

https://regex101.com/r/ARZxAV/1

And this:

([^Z][^n]|[^Z][^n][^s])(\.\s)

https://regex101.com/r/cvEUKM/1

And this:

(?!Zn)(\.\s)

https://regex101.com/r/IUZeso/1

On this sample:

1) aaa. Zn. bbb
2) aaa. Zn. bbb.
3) aaa Zn. bbb.
4) aaa Zn. bbb
5) aaa Bns. bbb
6) aaa Bns. Bbb
7) aaa Zn. Aaa Bns. Bbb
8) 1
9) aaa
10) aaa.
11) aaa Abbb
12) aaa. Zns. bbb
13) aaa. Zns. bbb.
14) aaa Zns. bbb.
15) aaa Zns. bbb

How to capture only the a. , b. and s. (of the aaa. bbb. Bns. groups in the sample) groups from the sample above, excluding the Zn. and Zns. ?

Expected output from sample in bold:

  1. aaa. Zn. bbb
  2. aaa. Zn. bbb.
  3. aaa Zn. bbb.
  4. aaa Zn. bbb
  5. aaa Bns. bbb
  6. aaa Bns. Bbb
  7. aaa Zn. Aaa Bns. Bbb
  8. 1
  9. aaa
  10. aaa.
  11. aaa Abbb
  12. aaa. Zns. bbb
  13. aaa. Zns. bbb.
  14. aaa Zns. bbb.
  15. aaa Zns. bbb
3

There are 3 best solutions below

4
rockinfreakshow On

non-regex formula for Sheets:

=let(Σ,split(A2," "),
 ifna(filter(Σ,right(Σ)=".",iserror(xmatch(Σ,{"Zn.","Zns."})))))
  • Use join() if you want all the output groups accommodated in single cell

enter image description here

arrayformula style:

=map(A2:A,lambda(Λ,if(Λ="",,let(Σ,split(Λ," "),join(" ",ifna(filter(Σ,right(Σ)=".",iserror(xmatch(Σ,{"Zn.","Zns."})))))))))

enter image description here

3
The fourth bird On

You might use a regex with a custom function.

\b(?!Zns?\b)[a-zA-Z]+\.\s

The pattern matches:

  • \b A word boundary to prevent a partial match
  • (?!Zns?\b) Negative lookahead, assert not Zn or Zns directly to the right
  • [a-zA-Z]+ match 1+ chars a-zA-Z
  • \.\s Match . and a whitspace char (which could also match a newline)

See a regex demo.

For example:

function myFunction(input) {
    const regex = /\b(?!Zns?\b)[a-zA-Z]+\.\s/g;
    const res = Array.from(input.matchAll(regex), m => m[0]);
    return res.length ? [res] : [[""]];
}

enter image description here

enter image description here

1
z.. On

Here's another solution

=ARRAYFORMULA(SPLIT(REGEXREPLACE(A1:A15,"Zns?\.|([A-Za-z]+\.)|.","$1 ")," "))