I have a few lists that contain the same IDs, stored as strings. They are as follows:
list1 = ["1", "2", "3", "4", "5"]
list2 = ["1", "2", "3", "5", "4"]
list3 = ["1", "5", "4", "3", "2"]
list4 = ["4", "2", "5", "3", "1"]
What measure can I use to determine the lists that are closest to each other here in terms of order? Ideally list1 and list2 should be the closest here.
Does the Spearman correlation make sense here?
Edit distance seems to be a good candidate for such a metric. Computing it for every pair of lists shows that list1 and list2 are the closest pair, which matches your intuition.
Note that in the Python code I use the Levenshtein distance where insert, delete, and replace operations are allowed. You can, of course, use other types of edit distance.
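For reference, here is a minimal sketch of the pairwise comparison. It assumes a hand-written `levenshtein` helper (my own illustrative name, not a library function) that works on lists of strings, with insert, delete, and replace each costing 1. The `lists` dictionary and the `distances` and `closest` variables are likewise only for illustration.

```python
from itertools import combinations


def levenshtein(a, b):
    """Levenshtein distance between two sequences (insert/delete/replace cost 1)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # replace x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


lists = {
    "list1": ["1", "2", "3", "4", "5"],
    "list2": ["1", "2", "3", "5", "4"],
    "list3": ["1", "5", "4", "3", "2"],
    "list4": ["4", "2", "5", "3", "1"],
}

# Distance for every unordered pair of lists.
distances = {
    (n1, n2): levenshtein(lists[n1], lists[n2])
    for n1, n2 in combinations(lists, 2)
}

# Print pairs from closest to farthest.
for pair, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(pair, d)

closest = min(distances, key=distances.get)
print("closest pair:", closest)
```

With these four lists, the closest pair is list1 and list2, at a distance of 2 (the swapped "4" and "5" count as two replacements). Swapping in a different edit distance only requires changing the `levenshtein` function.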