The following is an interesting algorithmic problem.
Inputs:
- The set of elements
S(say, non-negative integers). - A mapping
Nthat maps some elements fromSto a subsetS_k \subset S. (not all elements of S are represented, neither in keys nor in values). EachS_kconsists of unique, sorted elements only.
Example inputs:
S = {0, 1, 2, 3, 4, 5, 6, 7, 8}
N = {
0: [1, 2, 3],
4: [2, 3, 5, 6],
6: [5, 7, 8],
}
Let us denote by N[s] the set of elements corresponding to s (e.g., in the example above, N[0] = [1, 2, 3]).
The goal
Build a mapping M that maps keys in N to subsets of keys in N. The mapping should be of the form s_j: S_j where s_j \in S, S_j \subset S, and any s is included in the S_j if N[s_j] and N[s] have common elements.
Example result:
{
0: [4],
4: [0, 6],
6: [4],
}
The parameters are as follows:
- length of
N~100K elements - is the number of elements in the correspondence (
|N[s]|) ~10 - the number of unique elements (
|S|) ~100K
The following is my attempt at solving the task. It starts with building an inverted index, then iterating it to construct the result. Should be O(|N|) complexity but I'm unsure.
def merge_sets(sets: Sequence[Set]) -> Set:
"""List of sets in, a single merged set out. """
if not sets:
return set()
return reduce(or_, sets)
def neighbors_to_mutexes(
indexes: Union[List[int], np.ndarray],
neighbors_by_index: Union[List[List[int]], np.ndarray],
) -> Mapping[int, Set[int]]:
"""Given a list of indexes and a list of respective neighbors for each index,
find indexes whose neighbor lists share at least one element."""
inclusion_by_index = defaultdict(set) # which sets of neighbours is each index included in
neighbors_by_index = [set(neighbors) for neighbors in neighbors_by_index]
global_indexes = merge_sets(neighbors_by_index)
for index in global_indexes:
inclusion_by_index[index] = {
other_index
for other_index, neighbors in zip(indexes, neighbors_by_index)
if index in neighbors
}
intersecting = defaultdict(set)
for index, including_set in inclusion_by_index.items():
for i, j in product(including_set, including_set):
if i != j:
intersecting[i].add(j)
intersecting[j].add(i)
return intersecting
Kind of a graph problem with a bipartite graph, where each value in the lists is an intermadiate node on the way to other keys in N.