find number of unique durations given a list of durations and an upper bound

63 Views Asked by At

Lets say we are given a list of durations (5s, 10s, 10s, 15s, 15s, 15s, 25s, 30s....) and we want to find a list of unique durations that can be created using this list of single durations.

for example if we have the original list of (5, 5, 15, 25) we can create the following durations:

  • 5 - using one 5 element
  • 10 - using two 5 elements
  • 15 - using one 15 element
  • 20 - using one 5 element and one 15 element
  • 25 - using two 5 elements and one 15 element OR using the 25 element
  • 30 - using 25 and 5
  • 35 - using 25 and two 5s
  • 40 - using 25 and 15
  • 45 - using 25, 15 and 5
  • 50 - using all elements

As a bonus I want to set an upper limit. For example I want to use a an upper bound for the total duration of 37.

This should eliminate the options of 40, 45, and 50 because they are above the limit.

PS, durations are very frequently repeated in the original list, so maybe there's an optimisation possible there?

Anyone know of a way to approach this?

I have used combinatorics libraries to find all possible combinations using the element limit, and eliminate ones that are above the max limit, but the number grows with factorial complexity which made the solution way too slow.

3

There are 3 best solutions below

4
MBo On BEST ANSWER

Make a table T of size upper limit + 1, put 1 into T[0], or define set containing zero value. Then for every value v (duration) check for nonzero entries of table (say index i) and put 1 into v+i cell.

Python example with set:

a = [4,13,29]
lim = 37
t = {0}
for v in a:
    for i in range(lim - v + 1):
        if i in t:
            t.add(i + v)
t.remove(0)
print(len(t), t)

>> 19  {4, 8, 12, 13, 16, 17, 20, 21, 24, 25, 26, 28, 29, 30, 32, 33, 34, 36, 37}
1
Unmitigated On

You can solve this in O(n * limit) with dynamic programming where n is the number of elements in the list and limit is the upper bound.

For example, in Java:

final int limit = 37;
final List<Integer> durations = List.of(5, 10, 10, 15, 15, 15, 25, 30);

var possible = new boolean[limit + 1];
// possible[i] indicates if it is possible to create the duration i
possible[0] = true; // base case

for (int duration : durations)
    for (int i = limit; i >= duration; i--)
        possible[i] |= possible[i - duration];

List<Integer> possibleDurations = new ArrayList<>();
for (int i = 1; i <= limit; i++)
    if (possible[i]) possibleDurations.add(i);
1
Dillon Davis On

The two existing solutions tackle the problem with classic DP, with presumably optimal time complexity. I want to discuss some different ideas, which may add up to a logarithmic factor in the worst case, but may in fact be faster in practice.

The first approach is to use an interval tree to consolidate consecutive reachable sums into a single interval, meaning that for "dense" solutions, the solver will run significantly faster.

Start with an empty interval tree, and insert the half-open interval [0, 1). Then, for each value in your list V, walk over the intervals in the tree right (larger) to left (smaller). For an interval [S, E), if S+V is greater than limit L, skip to the next interval. Otherwise insert [V+S, min(V+E, L)) into the tree, and merge the interval with any overlapping (or touching / adjacent) interval (just delete and insert). At the end, the intervals you are left with span the values you can sum to with your list values. As your values you can sum to become more and more dense, the intervals will merge, resulting in fewer total intervals to iterate over (and thus, insert) per value in the remaining list. For optimal results, you should probably also sort your list in non-decreasing order, to maximize early merging of intervals.

This still adds a log(L) factor to the worst case time complexity of course, for the queries / inserts / deletions in the tree, but that's certainly worth it for the speedup for dense solutions specifically.


The second approach is to formulate the problem mathematically, and then use some trickery to simplify the expression for our desired result.

For each duration in your list we chose between add the element or not (to add zero). Imagine for a second the list [3, 7], this gives 0; 3; 7; 10, or written out a bit more explicitly 0 + 0; 3 + 0; 0 + 7; 3 + 7. Now imagine that we encode this information in a polynomial, specifically in the exponent. We would express our list, or rather our choices between 0?3 and 0?7 as:

P(x) = (x^0 + x^3) * (x^0 + x^7)
P(x) = x^(0+0) + x^(3+0) + x^(0+7) + x^(3+7)
P(x) = x^0 + x^3 + x^7 + x^10

Note that the exponents in our expanded polynomial are exactly the possible sums. In fact, the coefficient on each term is the count of how many times that sum (the exponent) is produced.

If we write out our list as the product of (1 + x^a) terms and simplify, we'll have our answer- the terms with non-zero coefficients are the sums we can make. To do this efficiently we can use the fast fourier transform (FFT) to perform polynomial multiplication (in O(n log n) time for two degree n polynomials). We repeat this polynomial multiplication recursively until we have the product of all terms, in O(n log^2 n) time.

However, we have an upper limit we can utilize, so if at each multiplication we reduce each polynomial to just its lower order terms (under the limit), we'll avoid computing higher order terms we don't need. For upper limit L and a list of N numbers, we'd have O(NL log L) time to arrive at the final product. However, this is assuming the worst case that all terms will be order L, which is likely not the case. What you can do is use basic O(N^2) polynomial multiplication to multiply out the polynomials with just a few terms (which will be faster than the FFT anyways due to the large constant factor), and then switch to using the FFT for polynomial multiplication at some threshold. In the case where most elements are really small and the sum of the values in the list is itself comparable in size to the limit L, it'll behave more like O(L log^2 L), and would beat the DP solution.

As a further optimization, when writing out the (1 + x^a) terms, you can combine like terms, either just raising component to the appropriate power after taking the FFT, or choosing a different polynomial entirely (1 + x^a + x^(2a) + ... + x^(ka)), and then rewriting with the formula for a geometric series to (1 - (x^a)^(k+1)) / (1 - x^a).


Both of these approaches are worse by a logarithmic factor in the generic case asymptotically speaking, however for practical inputs each could be faster in its own case. Under optimal conditions they can extremely fast; their time complexities are bounded tightly below by Ω(N) and Ω(L log^2 L) respectively, far better than a fixed Ω(NL).