I have a problem finding rows of a DataFrame where 2 Columns contain a Part of a given String s.
The Column Values (Type String(Object))
I mean about the opposite of str.contains or isin(), because the substring-mask is the Column-Value.
The String is not suitable for clear splitting, because the 3 Values "Cityname", "Districtname" and "Streetname" can contain Whitespaces.
Can you help me?
s = "Bad Testcity Teststr."
df_res = df.loc[(s.find(df['CITY'] != -1) & (s.find(df['DISTRICT'] != -1) & (s.find(df['STREET'] != -1)]`
This sample should return TRUE.
<bound method DataFrame.info of ZIP CITY STREET NUMBER NUMBER_SFX DISTRICT ONKZ ASB ADSL VDSL VDSL_SV VPSZ OUTDOOR
ID
4025217 12345 Bad Testcity Teststr. 6 NaN Bad Testcity 12345 2 +017.696 +102.784 NaN 49/12345/30 O
4025219 12345 Bad Testcity Teststr. 7 NaN Bad Testcity 12345 2 +017.696 +102.784 NaN 49/12345/30 O
4025242 12345 Bad Testcity Teststr. 8 NaN Bad Testcity 12345 2 +017.696 +102.784 +185.824 49/12345/30 O
4025244 12345 Bad Testcity Teststr. 10 NaN Bad Testcity 12345 2 +017.696 +102.784 NaN 49/12345/30 O
4025245 12345 Bad Testcity Teststr. 11 NaN Bad Testcity 12345 2 +017.696 +051.392 NaN 49/12345/30 O
... ... ... ... ... ... ... ... .. ... ... ... ... ...
[1569530 rows x 13 columns]>
Assuming this input:
You can concatenate the columns with a space, then use
str.containson the output:Less efficient alternative (can nevertheless be useful):
Output (here unchanged):