Remove duplicates from multiple lists at the same index


I know that NumPy's np.unique() removes all duplicated items from an array or list in Python, and that a list can also be deduplicated by converting it to a dict and back to a list.

However, my problem is the following:

For example, I have 3 lists:

l1 = ['a', 'b', 'c', 'd', 'e', 'f', 'a', 'j', 'a']
l2 = ['b', 'a', 'b', 'd', 'e', 'f', 'b', 'j', 'b']
l3 = ['c', 'a', 'a', 'd', 'e', 'f', 'c', 'j', 'c']

I would like to know whether there is a built-in function to remove the duplicated combination "a, b, c", where a is in list l1, b is in list l2, and c is in list l3, all at the same index. In the example, it should remove the items at indices 6 and 8 from all 3 lists.

Thanks for your help.

There are 2 best solutions below

Michael Cao On

If I understand your question correctly, the lists hold different values but their duplicates fall at the same positions, so np.unique would return the same unique_indices and unique_counts for each list. We can take advantage of this by running np.unique on just one list with return_index and return_counts set to True, and then applying its output to all three lists.

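import numpy as np

# ind: index of the first occurrence of each unique value in l1; cts: how often it occurs
uni, ind, cts = np.unique(l1, return_index=True, return_counts=True)
keep = ind[cts == 1]  # positions whose l1 value occurs exactly once
no_repeat_l1 = np.array(l1)[keep]
no_repeat_l2 = np.array(l2)[keep]
no_repeat_l3 = np.array(l3)[keep]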
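
If the goal is to compare whole (l1, l2, l3) combinations rather than rely on l1 alone, one possible variant is to stack the lists as columns and call np.unique with axis=0. This is only a sketch and not part of the answer above: the names cols, first_idx and keep are illustrative, and it assumes the first occurrence of each combination should be kept, in its original position.

```python
import numpy as np

l1 = ['a', 'b', 'c', 'd', 'e', 'f', 'a', 'j', 'a']
l2 = ['b', 'a', 'b', 'd', 'e', 'f', 'b', 'j', 'b']
l3 = ['c', 'a', 'a', 'd', 'e', 'f', 'c', 'j', 'c']

# Each row of cols is one (l1[i], l2[i], l3[i]) combination.
cols = np.column_stack([l1, l2, l3])

# return_index yields the index of the first occurrence of each unique row.
_, first_idx = np.unique(cols, axis=0, return_index=True)

# np.unique sorts the unique rows, so sort the indices to restore the original order.
keep = np.sort(first_idx)

# Split the kept rows back into three plain Python lists.
dedup_l1, dedup_l2, dedup_l3 = (cols[keep, j].tolist() for j in range(3))
print(dedup_l1, dedup_l2, dedup_l3, sep="\n")
```

Unlike filtering on cts == 1, this keeps the first "a, b, c" at index 0 and drops only the later copies at indices 6 and 8.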
Andrej Kesely On

IIUC, you can do:

l1 = ["a", "b", "c", "d", "e", "f", "a", "j", "a"]
l2 = ["b", "a", "b", "d", "e", "f", "b", "j", "b"]
l3 = ["c", "a", "a", "d", "e", "f", "c", "j", "c"]

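# Keep the first ("a", "b", "c") column and skip every later one
out, found = [], False
for t in zip(l1, l2, l3):
    if t == ("a", "b", "c"):
        if not found:
            out.append(t)
            found = True
    else:
        out.append(t)

# Transpose the kept columns back into three lists
l1, l2, l3 = map(list, zip(*out))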
print(f"{l1=}\n{l2=}\n{l3=}")

Prints:

l1=['a', 'b', 'c', 'd', 'e', 'f', 'j']
l2=['b', 'a', 'b', 'd', 'e', 'f', 'j']
l3=['c', 'a', 'a', 'd', 'e', 'f', 'j']
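
The loop above targets the single combination ("a", "b", "c"). If any repeated same-index combination should be dropped, the same idea can be generalized with a set of already-seen tuples. The following is a minimal sketch under that assumption; dedupe_columns is a hypothetical helper name, the items must be hashable, and zip stops at the shortest list if the lengths differ.

```python
def dedupe_columns(*lists):
    """Keep only the first occurrence of each same-index combination."""
    seen = set()
    kept = []
    for combo in zip(*lists):
        if combo not in seen:
            seen.add(combo)
            kept.append(combo)
    if not kept:
        # Nothing to transpose: return one empty list per input list.
        return [[] for _ in lists]
    # Transpose the kept combinations back into one list per input.
    return [list(col) for col in zip(*kept)]


l1 = ["a", "b", "c", "d", "e", "f", "a", "j", "a"]
l2 = ["b", "a", "b", "d", "e", "f", "b", "j", "b"]
l3 = ["c", "a", "a", "d", "e", "f", "c", "j", "c"]

new_l1, new_l2, new_l3 = dedupe_columns(l1, l2, l3)
print(f"{new_l1=}\n{new_l2=}\n{new_l3=}")
```

For the example lists this gives the same result as the loop above, because ("a", "b", "c") is the only combination that repeats.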