I have a regular expression (?<=^\d{0,6})(?!.*(\p{L})\1\1)\p{L}{6,11}$|^(?!.*(\p{L})\2\2)\p{L}{6,11}(?=\d{0,6}$) that matches the following criteria:
A sequence of 6 to 11 alphabetic letters. This sequence can either stand alone, or:
- Be preceded only by 1 to 6 digits at the beginning of the line.
- Be followed only by 1 to 6 digits at the end of the line.
- The sequence must NOT be both preceded AND followed by digits.
- The sequence of 6 to 11 alphabetic letters must NOT contain the same letter 3 or more times in a row, regardless of whether there are any digits before or after it.
I want to add one last criterion: the digits before or after the sequence of 6 to 11 alphabetic letters must NOT contain the same digit more than 4 times in a row.
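To make the intended behavior unambiguous, here is a rough Python sketch of the criteria above. It is not the regex I am looking for, only a reference for which lines should be accepted. It assumes that "more than 4 repeated digits" means the same digit 5 or more times in a row, and it uses [^\W\d_] as an approximation of \p{L}, since Python's built-in re module has no \p{L}. The names is_wanted, LINE, TRIPLE_LETTER and DIGIT_RUN are made up for this illustration.

```python
import re

# Assumption: [^\W\d_] (word characters minus digits and underscore)
# approximates \p{L}, which Python's built-in re module does not support.
LINE = re.compile(r"(\d{0,6})([^\W\d_]{6,11})(\d{0,6})")
TRIPLE_LETTER = re.compile(r"([^\W\d_])\1\1")  # same letter 3 times in a row
DIGIT_RUN = re.compile(r"(\d)\1{4}")           # same digit 5 times in a row


def is_wanted(line: str) -> bool:
    """Return True if the line satisfies all criteria listed above."""
    m = LINE.fullmatch(line)
    if m is None:
        return False  # wrong overall shape, or too many digits/letters
    before, letters, after = m.groups()
    if before and after:
        return False  # digits on both sides are not allowed
    if TRIPLE_LETTER.search(letters):
        return False  # a letter repeated 3 or more times in a row
    if DIGIT_RUN.search(before + after):
        return False  # new criterion: a digit repeated 5 or more times
    return True
```

The whole line is split into three captured parts first, so the digit check runs on the digits themselves rather than at the position where the letters start.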
Here is a sample of the inputs from my txt file:
- 0000048hghtrff7
- 295gdfinnnn548
- GJJfdDDDuhuBBB59
- 654bdfueogp48g4e
- 1852rhogkent4
- gihngpenitg
- 12reokgr8o5gLE
- FGJTfhds899
- 45954efreikLF
- 598Gfkoggge555yN
- Gdogrkrngo
- fmidjkydf1422222
I modified the regex by adding the assertions (?!([0-9])\1{3}) and (?!([0-9])\2{3}), but this doesn't seem to do anything: the lines starting or ending with the same digit at least 5 times in a row are still there.
(?<=^\d{0,6})(?!([0-9])\1{3})(?!.*(\p{L})\1\1)\p{L}{6,11}$|^(?!([0-9])\2{3})(?!.*(\p{L})\2\2)\p{L}{6,11}(?=\d{0,6}$)
How can I achieve this?