How to drop and keep only certain non-alphanumeric characters?

159 Views Asked by RustyShackleford At 28 January 2019 at 16:21

I have a df that looks like this:

email                                    id
{'email': ['[email protected]']}           {'id': ['123abc_d456_789_fgh']}

When I drop non-alphanumeric characters like so:

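# Strip everything that is not a letter or digit, then remove the leftover key name.
# regex=True is passed explicitly because newer pandas versions treat the pattern as a literal by default.
df.email = df.email.str.replace('[^a-zA-Z0-9]', '', regex=True)
df.email = df.email.str.replace('email', '')

df.id = df.id.str.replace('[^a-zA-Z0-9]', '', regex=True)
df.id = df.id.str.replace('id', '')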

The columns look like this:

email                    id
testtestcom              123abcd456789fgh

How do I tell the code to keep everything inside the square brackets but drop all non-alphanumeric characters outside the brackets?

The new df should look like this:

email                        id
[email protected]                123abc_d456_789_fgh


There are 2 best solutions below

Gianmar On 28 January 2019 at 17:16 BEST ANSWER

This is hardcoded, but works:

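# Raw strings keep `\[` intact; regex=True is explicit because newer pandas versions default to literal matching.
df.email = df.email.str.replace(r".+\['|'].+", '', regex=True)
df.id = df.id.str.replace(r".+\['|'].+", '', regex=True)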

>>> '[email protected]'
>>> '123abc_d456_789_fgh'
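
To see why the alternation works, here is a minimal stand-alone sketch that uses Python's `re` module instead of pandas. The left branch `.+\['` consumes everything up to and including `['`, and the right branch `'].+` consumes the closing `']}`. The helper name `strip_wrapper` and the sample strings are assumptions for illustration; the email address is a made-up placeholder, not the value from the question.

```python
import re

# Same pattern as in the answer, written as a raw string so `\[` reaches the regex engine unchanged.
PATTERN = re.compile(r".+\['|'].+")

def strip_wrapper(value):
    """Remove the leading "{'key': ['" part and the trailing "']}" part (hypothetical helper)."""
    return PATTERN.sub("", value)

# Hypothetical sample rows; the email address is a placeholder.
print(strip_wrapper("{'email': ['user@example.com']}"))
print(strip_wrapper("{'id': ['123abc_d456_789_fgh']}"))
```

Because the pattern relies on `['` and `']` appearing exactly once, it would likely misbehave on values that contain those sequences themselves.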

The fourth bird On 28 January 2019 at 16:27

According to the comments, what you might do is capture what is in between the square brackets in a capturing group.

In the replacement use the first capturing group.

\{'[^']+':\s*\['([^][]+)'\]}

That will match

\{ Match {
'[^']+' Match ', then any character except ' 1+ times, then '
: Match : literally
\s*\[' Match a whitespace character 0+ times, then ['
([^][]+) Capturing group 1: match any character except [ or ] 1+ times
'\] Match ']
} Match } literally
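
The capture-and-replace idea can be sketched with Python's `re` module as follows. This is only an illustration under assumptions: the helper name `unwrap` is hypothetical and the email address is a placeholder. The replacement `\1` refers to the first capturing group, i.e. the text between `['` and `']`.

```python
import re

# Pattern from the answer; group 1 captures the text between [' and '].
PATTERN = re.compile(r"\{'[^']+':\s*\['([^][]+)'\]}")

def unwrap(value):
    """Replace the whole wrapped string with the contents of group 1 (hypothetical helper)."""
    return PATTERN.sub(r"\1", value)

# Hypothetical sample rows; the email address is a placeholder.
print(unwrap("{'email': ['user@example.com']}"))
print(unwrap("{'id': ['123abc_d456_789_fgh']}"))

# Strings that do not match the pattern are returned unchanged.
print(unwrap("already_clean"))
```

On a DataFrame column, the same pattern and replacement should carry over to something like `df.id.str.replace(pattern, r'\1', regex=True)`, assuming the column holds strings rather than actual dict objects.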
