Add space between Persian numeric and letter with python re


I want to add a space between a Persian number and a Persian letter, like this:

"سعید123" should become "سعید 123"

The Java code for this is:

str.replaceAll("(?<=\\p{IsDigit})(?=\\p{IsAlphabetic})", " ")

But I can't find a Python solution.

revo On BEST ANSWER

Here is a short regex you can rely on to match the boundary between letters and digits (in any language):

\d(?=[^_\d\W])|[^_\d\W](?=\d)

Live demo

Breakdown:

  • \d Match a digit
  • (?=[^_\d\W]) that is followed by a letter (in any language)
  • | Or
  • [^_\d\W] Match a letter (in any language)
  • (?=\d) that is followed by a digit

Python:

re.sub(r'\d(?=[^_\d\W])|[^_\d\W](?=\d)', r'\g<0> ', s, flags=re.UNICODE)
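As a rough, self-contained sketch of the boundary regex from the breakdown above (using the positive lookaheads exactly as listed there), something like the following could be tried. The helper name `space_boundaries` and the sample strings are illustrative assumptions, not part of the original answer.

```python
import re

# Either a digit followed by a letter, or a letter followed by a digit.
# [^_\d\W] means "a word character that is neither '_' nor a digit", i.e. a letter.
BOUNDARY = re.compile(r'\d(?=[^_\d\W])|[^_\d\W](?=\d)')

def space_boundaries(text):
    """Insert one space at every letter/digit boundary (hypothetical helper)."""
    # \g<0> is the whole match (the single character before the boundary).
    return BOUNDARY.sub(r'\g<0> ', text)

name = "سعید"
print(space_boundaries(name + "123"))   # letters, then a space, then 123
print(space_boundaries("123" + name))   # 123, then a space, then letters
print(space_boundaries("abc42def"))     # abc 42 def
```

Because the lookaheads are positive, a digit or letter at the very end of the string is left alone, so no trailing space is added.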

But according to this answer, this is the right way to accomplish this task:

re.sub(r'\d(?=[آابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی])|[آابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی](?=\d)', r'\g<0> ', s, flags=re.UNICODE)
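A minimal sketch of the explicit-alphabet variant, assuming the letter list from the answer is the set you want to treat as Persian letters. The names `PERSIAN_LETTERS` and `space_persian` are hypothetical; the character class is built once so it does not have to be typed twice.

```python
import re

# Letter list copied from the answer above; adjust it if your text uses other characters.
PERSIAN_LETTERS = "آابپتثجچحخدذرزژسشصضطظعغفقکگلمنوهی"

# Digit followed by a listed letter, or a listed letter followed by a digit.
PERSIAN_BOUNDARY = re.compile(
    r'\d(?=[{0}])|[{0}](?=\d)'.format(PERSIAN_LETTERS)
)

def space_persian(text):
    """Insert a space only at boundaries between digits and the listed letters."""
    return PERSIAN_BOUNDARY.sub(r'\g<0> ', text)

name = "سعید"
print(space_persian(name + "123"))   # space inserted before 123
print(space_persian(name + "۱۲۳"))   # \d also matches Persian digits in Python 3
print(space_persian("abc123"))       # abc123 (Latin letters are not in the list)
```

Compared with the generic `[^_\d\W]` class, this version leaves boundaries involving letters from other scripts untouched.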
Rakesh On

I am not sure if this is a correct approach.

import re
k = "سعید123"
m = re.search(r"(\d+)", k)  # raw string avoids an invalid-escape warning
if m:
    k = " ".join([m.group(), k.replace(m.group(), "")])
    print(k)

Output:

123 سعید
Wiktor Stribiżew On

You may use

re.sub(r'([^\W\d_])(\d)', r'\1 \2', s, flags=re.U)

Note that in Python 3.x, the re.U flag is redundant because str patterns are Unicode-aware by default.
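To illustrate the note about the flag, here is a small sketch for Python 3 that runs the same pattern with no flag, with `re.U`, and with `re.ASCII` for contrast. The variable names are illustrative assumptions.

```python
import re

PATTERN = r'([^\W\d_])(\d)'
s = "سعید" + "123"

default_result = re.sub(PATTERN, r'\1 \2', s)                 # no flag
explicit_result = re.sub(PATTERN, r'\1 \2', s, flags=re.U)    # explicit re.U
ascii_result = re.sub(PATTERN, r'\1 \2', s, flags=re.ASCII)   # ASCII-only classes

print(default_result == explicit_result)  # True: re.U changes nothing for str patterns
print(ascii_result == s)                  # True: with re.ASCII the Persian letter is not matched
```

With `re.ASCII`, `\W` matches every non-ASCII character, so `[^\W\d_]` no longer accepts Persian letters and the string is returned unchanged.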

See the online Python demo and a regex demo.

Pattern details

  • ([^\W\d_]) - Capturing group 1: any Unicode letter (literally, any character other than a non-word character, a digit, or an underscore)
  • (\d) - Capturing group 2: any Unicode digit

The replacement pattern is a combination of the Group 1 and 2 placeholders (referring to corresponding captured values) with a space in between them.

You may use a variation of the regex with a lookahead:

re.sub(r'[^\W\d_](?=\d)', r'\g<0> ', s)

See this regex demo.
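The two variants in this answer can be compared side by side with a short sketch like the one below. The function names `with_groups` and `with_lookahead` and the sample inputs are assumptions made for illustration.

```python
import re

# Variant 1: capture the letter and the digit, then re-emit them with a space between.
LETTER_DIGIT_GROUPS = re.compile(r'([^\W\d_])(\d)')
# Variant 2: match only the letter and look ahead for the digit.
LETTER_BEFORE_DIGIT = re.compile(r'[^\W\d_](?=\d)')

def with_groups(text):
    return LETTER_DIGIT_GROUPS.sub(r'\1 \2', text)

def with_lookahead(text):
    return LETTER_BEFORE_DIGIT.sub(r'\g<0> ', text)

name = "سعید"
for sample in (name + "123", "123" + name, "a1b2"):
    # Both variants should agree on every sample.
    assert with_groups(sample) == with_lookahead(sample)
    print(repr(with_groups(sample)))
```

Both patterns only handle a letter followed by a digit, so an input that starts with digits followed by letters comes back unchanged. If that direction matters too, a second alternative for a digit followed by a letter would be needed.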