Counting occurrence number of bits equal to 1 in two different bitsets

58 Views Asked by At

Suppose one has two lists of bitsets A=(a1,a2,...aN) and B=(b1,b2,...bN). Each bitset holds K bits.

Now, for each couple of bitsets (an,bn), one scans all the couples of indexes (i,j) where an[i]==1 and bn[j]==1 and increments a counter for such a couple (i,j).

At the end, one would get the number of times each couple of indexes (i,j) has been seen in the combination of bitsets A and B.

Example for N=4 and K=3:

    0 1 2       0 1 2
---------    --------
a1  0 0 1    b1 0 0 1
a2  1 0 1    b2 0 0 1
a3  0 0 0    b3 1 1 0
a4  1 1 0    b4 1 0 1

One should get the following counts:

(0,0)=1   (0,1)=0   (0,2)=2
(1,0)=1   (1,1)=0   (1,2)=1
(2,0)=0   (2,1)=0   (2,2)=2

Obviously, one can use a vector C of size K*K for storing the count for each (i,j). Then, one can iterate 1<=n<=N, make the "product" of bitsets an and bn in order to get relevant couples (i,j) and update the vector C accordingly.

Question: is there a better way to compute this counters compared to this naive approach ? Better here means both less memory usage and less computing operations.

Note that in my real use case, one would have N=10000 and K=256 for instance.

An important point is that the bitsets are quite sparse, which is perhaps an indication to reduce computation. An idea would be to begin the count for a given couple (i,j) only if it was seen at least once before (this could be done for instance with the help of a Bloom filter).

0

There are 0 best solutions below