There are two possible data contexts for this question. First, consider two columns of data like:
Column A | Column B |
---|---|
1 | 3 |
1 | 4 |
6 | 5 |
2 | 1 |
Here the numbers are unique identifiers. My objective is to create a list in which each element is a group of all identifiers that are connected, directly or indirectly, by appearing in the same row as another member of the group. In the example above, 1, 2, 3, and 4 are connected and belong to one group, while 5 and 6 belong to another. The output I would like is:
[[1]]
[1,2,3,4]
[[2]]
[5,6]
Alternatively, instead of columns, I could have a list of smaller, possibly overlapping groups. For example:
[1,3], [1,4], [1,4,2], [6,5].
My objective and desired output here is the same.
I am currently running my own solution to this problem, but it is very brute force. I am using the second setup, where I have a list of groups. I wrote a loop that, for each group, goes through the whole list and absorbs any other group that shares an element with it. This seems to work, but (1) it is slow (it is only 5% done as of this writing) and (2) I will still need to identify and remove duplicate groups afterwards.
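For comparison, the task described above amounts to finding connected components, and a union-find (disjoint-set) structure is one common way to avoid rescanning the list for every group. The following is only a minimal Python sketch under stated assumptions: the identifiers are hashable and sortable, the input is a list of groups (a row of the two-column table is just a group of size two), and the name `connected_groups` is hypothetical, not taken from any library.

```python
from collections import defaultdict


def connected_groups(groups):
    """Merge groups that share any element; return sorted lists of members.

    Hypothetical helper for illustration only.
    """
    parent = {}  # maps each element to its parent in the union-find forest

    def find(x):
        # Register unseen elements as their own root.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path directly at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for group in groups:
        if not group:
            continue  # skip empty groups
        first = group[0]
        find(first)  # make sure single-element groups are registered too
        for other in group[1:]:
            union(first, other)

    # Collect elements by their final root.
    components = defaultdict(list)
    for element in list(parent):
        components[find(element)].append(element)
    return sorted(sorted(members) for members in components.values())


# The list-of-groups setup from the question.
print(connected_groups([[1, 3], [1, 4], [1, 4, 2], [6, 5]]))
# The two-column setup, one pair per row.
print(connected_groups([(1, 3), (1, 4), (6, 5), (2, 1)]))
```

Each element is touched only a few times, so the work grows roughly linearly with the total number of identifiers across all groups, and because every element ends up under exactly one root, no duplicate groups are produced that would need removing afterwards.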