Complexity of Overlaps SortedSet<T>

210 Views Asked by At

in the Microsoft documentation, the complexity is indicated as O(n)

but if you look at the implementation

  foreach (T item in other)
            {
                if (Contains(item))
                {
                    return true;
                }
            }

, then the search method is called for each element, which gives the complexity O(m*log(n))

who is really right?

2

There are 2 best solutions below

0
Guru Stron On

This is mistake in the docs, if you check docs for Contains (and it's implementation using FindNode) you will see:

This method is an O(log n) operation.

Which makes Overlaps to be O(n log m) where m - number of elements in the "this" collection and n is the number of elements in other.

Created PR for the docs.

UPD

PR was merged and now docs say:

This method is an O(n log m) operation, where m is Count and n is the number of elements in other.

0
StriplingWarrior On

You are correct. The documentation is wrong.

Similar methods like IsSubsetOf leverage the O(1) GetViewBetween method to achieve O(n) complexity when comparing two SortedSets with the same equality comparer.

Overlaps also has logic to optimize this case too, making it an O(1) operation if the two sets' min and max values prove that they can't overlap at all. In theory, Overlaps could apply an additional optimization to be O(n) when compared to another sorted set with overlapping ranges, but it doesn't.

In all other cases, as you've pointed out, Overlap resorts to a loop with a Contains check, and Contains is an O(log n) operation, making the loop (on average, assuming many non-matching values) O(n * log m) where n is the number in the other collection and m is the number in this collection.