Cosine similarity between multiple text columns of dataframe and list of names

Hi, I am looking to calculate the cosine similarity between multiple text columns of a dataframe and a list of names, and to return the best match and its similarity score for each row. I would also like to return True/False depending on whether the score meets a similarity threshold.

Example data looks like this:

#df1

name1         name2      name3
mahesh        suresh     suvarna
suresh        suresh     gv rao
suvarna       gv rao     ravi
kumar varma   Varma      suvarna
ravi shankar  robert     peter
d man mohan   kumar      man mohan

#df2 or Name List

white_list
suresh
ram
rao gv
kumar varma
sameer
d mohan

#Expected output

Best_match  Score   result
Mahesh      0.85    TRUE
Ravi Kumar  0.32    FALSE
Suresh      0.48    FALSE
Varma       0.52    FALSE
Sameer      0.32    FALSE
Mohan       0.81    TRUE

Can someone please help me do this?
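
One possible way to approach this, as a minimal sketch rather than a definitive answer: turn each name into a vector of character bigram counts, compute the cosine similarity of every name in a row against every white-list entry, and keep the highest-scoring entry. The helper names (`char_ngrams`, `cosine`, `best_match`), the bigram size, and the 0.8 threshold are all assumptions made for illustration, and the scores will not reproduce the numbers in the expected output above, since those depend on whichever vectorisation is used. Plain tuples stand in for the dataframe rows here so the example has no dependencies.

```python
import math
from collections import Counter


def char_ngrams(text, n=2):
    """Count character n-grams of a lowercased string padded with spaces."""
    s = f" {text.lower().strip()} "
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_match(names, white_list, threshold=0.8):
    """Return (best white-list entry, rounded score, score >= threshold) for one row."""
    wl_vecs = {w: char_ngrams(w) for w in white_list}
    best, score = None, 0.0
    for name in names:
        vec = char_ngrams(name)
        for w, w_vec in wl_vecs.items():
            s = cosine(vec, w_vec)
            if s > score:
                best, score = w, s
    return best, round(score, 2), score >= threshold


# Rows of df1 (name1, name2, name3) and the white list from the question.
rows = [
    ("mahesh", "suresh", "suvarna"),
    ("suresh", "suresh", "gv rao"),
    ("suvarna", "gv rao", "ravi"),
    ("kumar varma", "Varma", "suvarna"),
    ("ravi shankar", "robert", "peter"),
    ("d man mohan", "kumar", "man mohan"),
]
white_list = ["suresh", "ram", "rao gv", "kumar varma", "sameer", "d mohan"]

for row in rows:
    print(row, "->", best_match(row, white_list))
```

With pandas, the same function could presumably be applied row by row over the three name columns. Character n-grams are used instead of whole-word tokens because short names share few whole words; a word-level vectoriser would score "gv rao" against "rao gv" as identical, while bigrams also penalise the changed word order.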
