I am unable to understand how mutual_info_score from sklearn.metrics works.
I have a list that represents the arms of a maze; the numbers can be between 1 and 8. The numbers carry no weight and are all equal; they just represent the arm number. They come from two different conditions, say A and B.
Suppose the arms visited in condition A are stored in list_a and those visited in condition B are stored in list_b.
I want to find out whether there is any mutual information between the two conditions. For example, if list_a = [1,2,3,4,5,6,7] and list_b = [1,2,3,4,5,6,7], then mutual information = 1, and if list_a = [1,2,3,4] and list_b = [5,6,7,8], then mutual information = 0.
I use this code:
from sklearn.metrics import mutual_info_score
list_a = [1,2,3,4,5,6,7]
list_b = [1,2,3,4,5,6,7]
mutual_info = mutual_info_score(list_a, list_b)
However, I get a different value than I expected: mutual information = 1.94 instead of 1.
I am not able to understand how mutual_info_score is computed.
Is this the correct way to compute the mutual information?
PS: I am not sure where to post this, but since it involves Python modules I am putting it here.
Edit: list_a and list_b could be of variable length, so comparing lengths is not desired.
Doesn't this come pretty close to what you're after? This essentially computes the ratio between the length of the union of the two sets and the sum of the lengths of the two sets.
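The code that originally went with this answer is not shown here, so what follows is only a sketch of a set-based ratio of this kind, not necessarily the exact one described. It assumes the intersection of the two sets in the numerator (rather than the union), because that is what gives 1 for identical lists and 0 for disjoint ones, as asked in the question. The function name arm_overlap is made up for illustration.

```python
def arm_overlap(list_a, list_b):
    """Overlap of the arms visited in two conditions, between 0 and 1.

    Hypothetical helper: twice the number of shared arms divided by the
    sum of the number of distinct arms in each list. Works for lists of
    different lengths because only the sets of arms are compared.
    """
    set_a, set_b = set(list_a), set(list_b)
    if not set_a and not set_b:
        return 0.0  # assumption: two empty lists count as no overlap
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


print(arm_overlap([1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4, 5, 6, 7]))  # 1.0
print(arm_overlap([1, 2, 3, 4], [5, 6, 7, 8]))                    # 0.0
print(arm_overlap([1, 2, 3], [2, 3, 4, 5]))                       # 4/7, about 0.571
```

Note that this measures how many arms the two conditions share. It is not mutual information in the information-theoretic sense.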
Output:
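As for the 1.94 in the question: mutual_info_score treats its two arguments as paired labels of the same samples (so they must have the same length) and returns the mutual information in nats, i.e. using the natural logarithm. It is not scaled to lie between 0 and 1. For two identical lists with 7 distinct values the result is ln(7), roughly 1.9459. The sketch below recomputes the same quantity in plain Python to make that visible; the function name mutual_information_nats is invented for this example and is not part of scikit-learn.

```python
from collections import Counter
from math import log


def mutual_information_nats(xs, ys):
    """Mutual information (natural log) between two paired label sequences."""
    if len(xs) != len(ys):
        raise ValueError("labels are paired sample by sample, so lengths must match")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # how often each (x, y) pair occurs
    count_x = Counter(xs)         # marginal counts of the first labeling
    count_y = Counter(ys)         # marginal counts of the second labeling
    mi = 0.0
    for (x, y), n_xy in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (n_xy / n) * log(n_xy * n / (count_x[x] * count_y[y]))
    return mi


identical = mutual_information_nats([1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4, 5, 6, 7])
disjoint = mutual_information_nats([1, 2, 3, 4], [5, 6, 7, 8])
print(identical)  # ln(7), about 1.9459
print(disjoint)   # ln(4), about 1.3863, not 0
```

The second result shows why this score does not fit the question: the actual label values are irrelevant, only how consistently position i in one list pairs with position i in the other. [1,2,3,4] and [5,6,7,8] map one-to-one, so their mutual information is as high as it can be for four distinct values.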