PCRE2 (PHP>=7.3) Positive Lookbehind Regex to search for strings separated by ","


I have this string:

'newProductsInfo: [[1] - $2 dollars,[2] New Product,[3] Hello,[4]Same, [5]Value]'

The label 'newProductsInfo:' and a space ' ' always precede the array of strings. I want to be able to return

[1] - $2 dollars

[2] New Product

[3] Hello

[4]Same, [5]Value // should be returned as a single item, since the comma is followed by a space ' '

in the Regex101 site.

Currently, using (?<=newProductsInfo: \[)[^,\]]+ only returns [1.


Edit: added possible return types in bubble plugin creator:



There are 2 best solutions below

The fourth bird On BEST ANSWER

Your pattern (the question was originally tagged JavaScript) only matches [1 because the negated character class [^,\]]+ cannot match a comma or ], so the match stops at the first ].
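As a quick check of that behavior, here is a minimal sketch that runs the original pattern in JavaScript. It assumes an engine with lookbehind support (for example a recent Node.js), and the variable names are only illustrative.

```javascript
// The input string from the question.
const questionInput =
  "newProductsInfo: [[1] - $2 dollars,[2] New Product,[3] Hello,[4]Same, [5]Value]";

// The lookbehind only succeeds right after "newProductsInfo: [",
// and [^,\]]+ stops at the first "," or "]".
const originalAttempt = questionInput.match(/(?<=newProductsInfo: \[)[^,\]]+/g);

console.log(originalAttempt); // [ '[1' ]
```

Only one position in the string satisfies the lookbehind, so the global flag does not produce any further matches.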

To get only the matches, you can assert that newProductsInfo: followed by any characters occurs to the left of the current position.

Start each match at digits between square brackets, and extend it up to either the next bracketed number that is directly preceded by a comma, or the end of the string.

Using a lookbehind assertion (check that your JavaScript engine supports lookbehind) for it:

(?<=newProductsInfo: .*?)\[\d+].*?(?=,\[\d+]|$)
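A minimal sketch of how this pattern could be applied in JavaScript, assuming an engine that supports variable-length lookbehind (for example a recent Node.js). The variable names are illustrative and not part of the original answer.

```javascript
// The input string from the question.
const sampleInput =
  "newProductsInfo: [[1] - $2 dollars,[2] New Product,[3] Hello,[4]Same, [5]Value]";

// Each match starts at "[digits]" and runs lazily until a comma that is
// directly followed by "[digits]", or until the end of the string.
const itemPattern = /(?<=newProductsInfo: .*?)\[\d+].*?(?=,\[\d+]|$)/g;

const lookbehindMatches = sampleInput.match(itemPattern);

console.log(lookbehindMatches);
// [ '[1] - $2 dollars', '[2] New Product', '[3] Hello', '[4]Same, [5]Value]' ]
```

Note that the last match runs to the end of the string, so it keeps the closing ] of the outer array. If that bracket is unwanted, it would have to be removed in a separate step.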


Edit

If you want to use PHP:

(?:newProductsInfo: \[|\G(?!^)),?\K[^,\n]+(?>,\h+[^,\n]+)*

Explanation

  • (?: Non capture group for the alternatives
    • newProductsInfo: \[ Match newProductsInfo: [
    • | Or
    • \G(?!^) Assert the current position at the end of the previous match, not at the start
  • ) Close the non capture group
  • ,?\K Match an optional comma and forget what is matched so far
  • [^,\n]+ Match 1+ chars other than , or a newline
  • (?>,\h+[^,\n]+)* Atomic group: optionally repeat matching a comma, 1+ horizontal whitespace chars and 1+ chars other than , or a newline
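JavaScript has no \G, \K or atomic groups, so the PHP pattern cannot be used there as written. The steps above can be approximated with the sticky (y) flag, as in the rough sketch below. This is an emulation under stated assumptions, not the PHP code itself: the function name extractItems is hypothetical, a capture group stands in for \K, and [ \t] is used as a narrower stand-in for \h.

```javascript
// Hypothetical helper that emulates the \G-anchored PCRE pattern in JavaScript.
function extractItems(input) {
  const prefix = "newProductsInfo: [";
  const start = input.indexOf(prefix);
  if (start === -1) return [];

  // Sticky (y) flag: every match must begin exactly at lastIndex,
  // which mimics \G (continue where the previous match ended).
  // The capture group excludes the optional leading comma, like \K.
  const item = /,?([^,\n]+(?:,[ \t]+[^,\n]+)*)/y;
  item.lastIndex = start + prefix.length;

  const items = [];
  let m;
  while ((m = item.exec(input)) !== null) {
    items.push(m[1]);
  }
  return items;
}

const phpStyleItems = extractItems(
  "newProductsInfo: [[1] - $2 dollars,[2] New Product,[3] Hello,[4]Same, [5]Value]"
);
console.log(phpStyleItems);
// [ '[1] - $2 dollars', '[2] New Product', '[3] Hello', '[4]Same, [5]Value]' ]
```

As with the pattern it imitates, the last item keeps the closing ] of the outer array, because ] is neither a comma nor a newline.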


Tim Biegeleisen On

Here is one approach using replace() and split():

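var input = 'newProductsInfo: [[1] - $2 dollars,[2] New Product,[3] Hello,[4]Same, [5]Value]';
// Remove everything up to the outer "[" and the trailing "]",
// then split on each comma that is directly followed by a numbered bracket.
var matches = input.replace(/^.*?\[(?=\[\d+\])|\]$/g, "")
                   .split(/,(?=\[\d+\])/);
console.log(matches); // ["[1] - $2 dollars", "[2] New Product", "[3] Hello", "[4]Same, [5]Value"]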

The call to replace() strips off the leading newProductsInfo: [ prefix and the trailing ]. The call to split() splits the remaining string at every comma that is directly followed by a numbered bracket term such as [2], so a comma followed by a space is left alone.