Example: I have 4 columns in my DataFrame, and I want to use Jaro similarity to compare columns A and B against columns C and D, all of which contain strings.

Currently I am applying it between 2 columns using:

df.apply(lambda x: textdistance.jaro(x['A'], x['C']), axis=1)

So far I have only been comparing names:

| A | C | result |
| --- | --- | --- |
| Kevin | kenny | 0.67 |
| Danny | Danny | 1 |
| Aiofa | Avril | 0.75 |

I have over 100K records in my DataFrame.

Column A contains person names.

Column B contains cities.

Column C contains person names (to compare with A).

Column D contains cities (to compare with B).

Expected output:

| A | B | C | D | result |
| --- | --- | --- | --- | --- |
| Kevin | London | kenny | Leeds | 0.4 |
| Danny | Dublin | Danny | dublin | 1 |
| Aiofa | Madrid | Avril | Male | 0.65 |


There is 1 best solution below

Kevin D On

df.apply(lambda x: textdistance.jaro(x['A'] + x['B'], x['C'] + x['D']), axis=1)

Thank you, DarrylG.
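For anyone who wants to try the concatenation idea without a DataFrame, here is a minimal sketch. It is not the textdistance implementation: `jaro` below is a plain-Python stand-in written for illustration, and `combined_jaro` and its `ignore_case` flag are hypothetical names that are not part of any library. Scores from textdistance may differ slightly from this stand-in.

```python
def jaro(s1: str, s2: str) -> float:
    """Plain-Python Jaro similarity (illustrative stand-in for textdistance.jaro)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk both strings' matched characters in order and count mismatches;
    # half of that count is the number of transpositions.
    k = 0
    mismatched = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                mismatched += 1
            k += 1
    transpositions = mismatched // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def combined_jaro(a: str, b: str, c: str, d: str, ignore_case: bool = False) -> float:
    """Compare A+B against C+D, mirroring the lambda in the answer above.

    ignore_case is an assumption added for illustration: it lowercases both
    sides first, so "Dublin" and "dublin" are treated as equal.
    """
    left, right = a + b, c + d
    if ignore_case:
        left, right = left.lower(), right.lower()
    return jaro(left, right)


# Rows taken from the expected-output table in the question.
rows = [
    ("Kevin", "London", "kenny", "Leeds"),
    ("Danny", "Dublin", "Danny", "dublin"),
    ("Aiofa", "Madrid", "Avril", "Male"),
]
scores = [combined_jaro(*row) for row in rows]
for row, score in zip(rows, scores):
    print(row, round(score, 2))
```

Note that the comparison is case-sensitive in this sketch, so the Danny/Dublin row scores below 1 unless `ignore_case=True` is passed. With pandas, the same function could be plugged in as `df.apply(lambda x: combined_jaro(x['A'], x['B'], x['C'], x['D']), axis=1)`.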