In a sorted array of prime numbers arr find index i of the smallest number in arr such that arr[i] divides given number

144 Views Asked by At

Consider the following data:
arr = [2,3,5,7,11,13,17,19]
n = 100
Output of the given should be 0 since n % 2 = 0 and arr[0] = 2.

Since array is sorted I want to use that to my advantage and design an algorithm that can work in a complexity smaller than O(n), which you have by iterating over the whole array and finding first i such that n % arr[i] = 0.

I instantly thought of using modified Binary Search with the same complexity as the regular Binary Search(O(logN)), which does the same as a regular Binary Search which divides the search interval in half in each iteration with slight modification. Here is the implementation i used:

public int SmallestFactorIndex(int n)
{
    int left = 0;
    int right = max;
    int index = -1;

    while (left <= right)
    {
        int middle = (right - left) / 2 + left;
        int middleValue = sieve[middle];

        if (n % middleValue == 0)
        {
            index = middle;
            right = middle - 1;
        }
        else if (middleValue < n)
        {
            left = middle + 1;
        }
        else
        {
            right = middle - 1;
        }
    }

    return index;
}

Max represents length of the array - 1 and sieve is the sorted array of prime numbers. Only addition to the Binary Search is that if the current number does divide n, I dont instantly return that index but continue searching for a smaller number than that current number by moving the right border to the middle one. All other parts work pretty much the same way. I dont think I can take advantage of the fact that the array only consists of the prime numbers but I might be wrong.

3

There are 3 best solutions below

3
Joel Coehoorn On BEST ANSWER

i want to make this a log(N) operation

O(log n) is commonly associated with a binary algorithm: you split the data set at some midpoint; based on this midpoint you then decide which half to keep and which to discard, and repeat with the kept half until you zero in on the result.

This works great when the data is presorted (it is, yay) and an item can be known (on it's own!) to definitely solve the query. And there's our trouble. If n is, say, 286 (2 x 11 x 13), then any of 2, 11, and 13 are potential solutions, and landing on any of them via binary search doesn't tell us we're done until we fully complete the search... we still have to check everything.

Abstracting to the general case, this suggests a simple binary search on the array is not good enough by itself, because simply finding A solution doesn't stop the search for THE solution.

But it can work if we know the result in advance. That is, if we can solve to find the smallest prime factor (regardless of index) in O(log n) time — if we know the value we're looking for is the 2, instead of the 11 or the 13 — we can then also search the array in O(log n) time, and O(log n) plus O(log n) is still O(log n).


If we look at the input, fully factoring a number is definitely NOT O(log n) -- famously so, to the point a number of important cryptographic algorithms rely on this being much worse.

But we don't need to fully factor the number. We only need the first (smallest) prime. This can obviously do a little better... but is it good enough? And unfortunately, I think the answer is still, "no".

Reading a quick refresher, I think it's likely we can beat linear time for the average/typical case, and even make it all the way to O(sqrt n) to find the first prime. O(sqrt n0) + O(log n1) is still O(sqrt n), but at least it's still better (on the surface) than O(n).


But this is where things get a little tricky. Now we also have to worry about keeping the "n" from the input variable separate from the "n" for the primes array. When we do that, we realize the "n" for the input variable involves is every integer up to that value, rather than just the (much smaller) length of the pre-computed primes array.

Therefore, for typical n inputs up to about a million or so (and that number is a complete guess, but you could write some code to validate and refine it), the linear search is likely the best you can do.


Speaking of writing code to pre-validate inputs, this brings up the next option:

O(1)

If you can put some kind of reasonable boundary on the possible input data, it's not out of the question to fully pre-compute the results for every candidate integer and put those values into a O(1) dictionary. Even a million candidates would only take about 8 MB RAM, and suddenly now every input only requires a dictionary lookup.

0
harold On

design an algorithm that can work in a complexity smaller than O(n)

The good news is that if n is not prime, then linearly going through the list is already only an O(sqrt(n)) algorithm: the smallest factor of a composite number cannot be larger than its square root. For a plain old int that's not too bad (but there's no hope of factoring an RSA key that way).

For an n that is prime, you can cut the linear search short when you reach a number larger than the square root of n (so this is still O(sqrt(n)) so far) and now you actually can use binary search to find the index of n itself (which must be the lowest prime to divide n at this point, since we just proved that n is prime), if it is in the list.

Using the PNT, I think we could even argue that the time complexity of linearly going through that list of primes until we find the lowest prime factor is O(sqrt(n) / log(n)). The division by log(n) comes from the density of primes "thinning out" as n increases (so we can reach the square root of n asymptotically faster by iterating only over the primes, as we do here, compared to iterating over the integers).

1
Dmitry Bychenko On

Well, if p divides n then p is at most sqrt(n): in the worst case n = p * p. Let's check prime numbers up to sqrt(n):

public int SmallestFactorIndex(int n) {
  int result = -1;

  for (int i = 0; i < arr.Length; ++i) {
    int d = arr[i];
  
    (int q, int r) = Math.DivRem(n, d);

    if (r == 0)
       return i;
    if (q <= d)
       return -1;
  }

  return result; 
}

So far so good we have O(sqrt(n)) time complexity which is better than O(n). We can be better off with more elaborated algorithms e.g. Pollard's rho which has O(sqrt(sqrt(n))) = O(n^1/4) time complexity, but still no logarithms here. For logarithm time you can only check if n prime or not (AKS test), but not find a prime factor.