Optimization of itertools for billion combinations

55 Views Asked by At

I deal with the generation of billion combinations using the itertools module in python. For example, I have A=[[0.0, 963.07438, 1926.14876], [0.0, 3203.76339, 6407.52678], [0.0, 3231.67715, 6463.3543]].

I need to generate a lot of combinations of A in a way: each element from sub-nest. For example, here k=(0.0, 0.0, 0.0), (0.0, 0.0, 3231.67715), (0.0, 0.0, 6463.3543), (0.0, 3203.76339, 0.0), (0.0, 3203.76339, 3231.67715), (0.0, 3203.76339, 6463.3543), (0.0, 6407.52678, 0.0), (0.0, 6407.52678, 3231.67715), (0.0, 6407.52678, 6463.3543), (963.07438, 0.0, 0.0), (963.07438, 0.0, 3231.67715), (963.07438, 0.0, 6463.3543), (963.07438, 3203.76339, 0.0), (963.07438, 3203.76339, 3231.67715), (963.07438, 3203.76339, 6463.3543), (963.07438, 6407.52678, 0.0), (963.07438, 6407.52678, 3231.67715), (963.07438, 6407.52678, 6463.3543), (1926.14876, 0.0, 0.0), (1926.14876, 0.0, 3231.67715), (1926.14876, 0.0, 6463.3543), (1926.14876, 3203.76339, 0.0), (1926.14876, 3203.76339, 3231.67715), (1926.14876, 3203.76339, 6463.3543), (1926.14876, 6407.52678, 0.0), (1926.14876, 6407.52678, 3231.67715), (1926.14876, 6407.52678, 6463.3543)

Thus, I have N^k combination, here N=3,k=3 ->27.

I can do it using

sums = ((vs,sum(vs)) for vs in itertools.product(*A))

But, I need to reduce this k with saving only the combinations with the specific conditions: abs(v-12000)<(2000).
For, example here, I have only 6 combinations: (0.0, 6407.52678, 6463.3543), (963.07438, 3203.76339, 6463.3543), (963.07438, 6407.52678, 3231.67715), (963.07438, 6407.52678, 6463.3543), (1926.14876, 3203.76339, 6463.3543), (1926.14876, 6407.52678, 3231.67715) satisfying this condition.

Finally, I have this code:

sums = ((vs,sum(vs)) for vs in itertools.product(*A))

for k,v in sums:
    
    if (abs(v-12000)<(2000)):
        print k

It takes a lot of time, when I have N=108 and k=10 (billion combinations).

How can I speed up or optimize this code? Maybe there is alternative to itertools?

Thanks

1

There are 1 best solutions below

0
Stef On

This is a task for dynamic programming; you can build a table using the recurrence formula:

combs[i, t] = list(itertools.chain.from_iterable(
    ((*c, j) for c in combs[i-1, t-v])
    for j,v in enumerate(A[i-1])
))

to build a table; where combs[i,t] is the list of tuples of length i using one element from each of the i first sublists of A, such that these elements sum to t (plus or minus your tolerance tol=2000).

You can initalise the edges of the table:

combs[0, t] = [()] if t <= tol
combs[0, t] = [] if t > tol
combs[i, t] = [] if t + tol < 0
combs[i, t] = [] if t - tol > i * max_element_of_A # this can be refined further using itertools.accumulate on the list of maximums of the lists of A

The table will have:

  • one row for each possible value of i (between 0 and k);
  • one column for each possible value of t (between 0 and 12000, but you can round all elements in A to units or tens or hundreds to decrease the number of possible values)