Find frequency of any element in given (+ve) integer Array

76 Views Asked by At

Let, I have an array of size N and elements of array are denoted by Array[i] where i is index here,

Now I need to find out whether any element in the given array occurs a specific amount of time or not?

Conditions are given as follows : **Array[i] is much greater than N** for all i belongs to N (you can think as element of array is thousand times or even more greater than the size of the arraay) **Time Complexity = O(N) Space Complexity = O(1)**

Example= ARRAY[] = {1111, 2222, 3333, 2222, 3333, 3333, 3333, 1111, 2222, 3333}

now is it possible to find out if any element occurs more than 4 times in given array by traversing the given array once / twice / thrice / ....

1

There are 1 best solutions below

0
btilly On

This requires destroying the array as we analyze it.

We will evaluate n/7 of the values in a single pass. We will do it by using the first 6n/7 of the array as a Cuckoo hash as follows.

  1. We divide that section into pairs of elements.
  2. If a given pair (i,j) has j <= n then j is not a value and that is an empty bucket.
  3. If a given pair (i,j) has j not in one of the 2 locations where it would go in the hash, that's an empty bucket.
  4. Assuming it is a full bucket, if n < i, then j appears in the hash once.
  5. Assuming it is a full bucket, if n <= i, then j appears in the hash i times.

Note that the first 6n/7 of the array will be treated as 3n/7 buckets, which is 3x the n/7 values we are trying to hash. Therefore our load factor is 1/3 < 1/2 and Cuckoo hashing works.

I'll just outline it from here. But this is the idea.

We'll need a counter inserted_values for how many distinct values are in the hash.

We initialize inserted_values by scanning the hash buckets to see how many happen to have a value in a correct place to be a full bucket. Note, if a value appears in both places it could be, only the one for the first hash value is correct.

And now we start from the end of the array and move towards the front. As we look at each value, we go as follows.

idx = len(array)
while 0 < idx:
    idx -= 1
    current_value = array[idx]
    if current_value is <= n:
        pass
    elif current_value is correctly placed in the hash:
        pass
    elif current_value can be found in the hash:
        if current_value only appears once:
            swap where count will go with where current_value is
            set count to 2
            idx += 1
        else:
            increment count
            set current_value to 0
    elif inserted_values < n/7:
        swap current value with where it belongs in the hash
        counter += 1

(NOTE: I left out rehashing. That adds considerable complications, but the principle is the same.)

And now we can scan the hash buckets, to find n/7 of the values with their counts. If any has a count meeting your condition, you have the answer. Otherwise 0 out the counts and processed values.

Each stage processes n/7 of the hash, so at most 7 are needed. Each requires 2 passes, so total of 14 passes. This is O(n) of work (I left some complications around rehashing out, so let's just say with very high likelihood O(n) work) to produce your answer.

It is using the array itself to track everything. In addition you need O(1) memory, so it meets the stated requirements.