I am working on Jaro-Winkler similarity and can use it between 2 columns, but how do I use it with 2 pairs of columns?

Asked by Kevin D on 10 August 2022 at 22:00 (194 views)

For example, I have 4 columns in my dataframe, and I want to compute Jaro similarity for columns A, B versus columns C, D, all of which contain strings.

Currently I am using it between 2 columns like this:

df.apply(lambda x: textdistance.jaro(x['A'], x['C']), axis=1)

So far I have only been comparing names:

| A | C | result |
| --- | --- | --- |
| Kevin | kenny | 0.67 |
| Danny | Danny | 1 |
| Aiofa | Avril | 0.75 |

I have over 100K records in my dataframe.

Column A: person name strings

Column B: city strings

Column C: person name strings (to compare with A)

Column D: city strings (to compare with B)

Expected output:

| A | B | C | D | result |
| --- | --- | --- | --- | --- |
| Kevin | London | kenny | Leeds | 0.4 |
| Danny | Dublin | Danny | dublin | 1 |
| Aiofa | Madrid | Avril | Male | 0.65 |


There is 1 solution below

Kevin D on 15 August 2022 at 10:48

df.apply(lambda x: textdistance.jaro(x['A'] + x['B'], x['C'] + x['D']), axis=1)

Thank you, DarrylG.
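For anyone who wants to experiment with the concatenation idea, here is a minimal sketch of it, next to one possible alternative that scores the name pair and the city pair separately and averages them. Several things in it are my own assumptions and not part of the answer above: `jaro` is a plain-Python stand-in so the snippet runs without `textdistance` installed, the helper names `norm`, `concat_score` and `mean_score` are hypothetical, and the lowercasing and the `sep` separator are additions (the expected output treats `Dublin` and `dublin` as a perfect match, which a case-sensitive comparison would not give). The scores it prints will not necessarily reproduce the illustrative numbers in the question.

```python
def jaro(s1: str, s2: str) -> float:
    """Plain-Python Jaro similarity in [0, 1] (stand-in for textdistance.jaro)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only if they are at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def norm(value) -> str:
    """Assumed normalization: cast to str, trim whitespace, lowercase."""
    return str(value).strip().lower()


def concat_score(row, sim=jaro, sep=" "):
    """Concatenate A+B and C+D (as in the answer), then compare once."""
    left = norm(row["A"]) + sep + norm(row["B"])
    right = norm(row["C"]) + sep + norm(row["D"])
    return sim(left, right)


def mean_score(row, sim=jaro):
    """Alternative: compare names and cities separately, then average."""
    name = sim(norm(row["A"]), norm(row["C"]))
    city = sim(norm(row["B"]), norm(row["D"]))
    return (name + city) / 2


# Rows taken from the question's expected-output table.
rows = [
    {"A": "Kevin", "B": "London", "C": "kenny", "D": "Leeds"},
    {"A": "Danny", "B": "Dublin", "C": "Danny", "D": "dublin"},
    {"A": "Aiofa", "B": "Madrid", "C": "Avril", "D": "Male"},
]
for row in rows:
    print(row["A"], row["C"], round(concat_score(row), 2), round(mean_score(row), 2))
```

Because both helpers only index the row by column name, they should also work on a DataFrame row, for example `df.apply(lambda r: concat_score(r, sim=textdistance.jaro), axis=1)`. The separator keeps the end of the name from running into the start of the city, and the averaged variant keeps a long city name from dominating a short person name, so which one fits depends on how the two fields should be weighted.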
