How can I get recordlinkage-style functionality in PySpark? I want to run a similarity check between the Name in Dataset 1 and the Name in Dataset 2.
Is there a library available for PySpark that does this? Any suggestions would be appreciated.
I tried the Python recordlinkage library, but it only works with pandas DataFrames.
Splink is the best option that I know of. It is a Python record linkage library that can run on Spark.