Increased run time of binary search when trying to fix IndexError

31 Views Asked by At

I recently followed a youtube tutorial on Binary Search and I ran into a problem with list index being out of range when I ran the code the youtuber had built. As far as I can tell I did exactly what they did, and now I am wondering what I did wrong. I tried removing the "if list[midpoint] == target:" for "if midpoint == target:" but that opened up another can of worms where the midpoint slowly increased making the code much slower than I anticipated... Below is the code:

def binary_search(list, target, low=None, high=None):
    if low is None:
        low = list[0]
    if high is None:
        high = list[-1]
    midpoint = (low+high) // 2

    if high < low:
        return print("Target was not in the list")
    if list[midpoint] == target:
        return midpoint
    elif target < list[midpoint]:
        return binary_search(list, target, low, midpoint-1)
        # new max == midpoint-1 (because it was not the target)
    else:
        return binary_search(list, target, midpoint+1, high)
        # new min == midpoint+1 (because it was not the target)`

if __name__ == '__main__':
    # tests how fast the binary search is by randomly generating a sorted list
    length = 100
    sorted_list = set()
    while len(sorted_list) < length:
        sorted_list.add(random.randint(-3*length, 3*length))
    sorted_list = sorted(list(sorted_list))

    start = time.time()
    for target in sorted_list:
        binary_search(sorted_list, target)
    end = time.time()
    print("Binary search time: ", (end - start)/length, "seconds")`

I am very confused and would appreciate any feedback you can give me.

As mentioned, I "fixed" the issue by exchanging "if list[midpoint] == target:" for "if midpoint == target:" as well as "elif target < list[midpoint]" for "elif target < midpoint". So far so good, no index error! But then when I was debugging and keeping track of "midpoint == target", whenever the midpoint was found, it began to increase the midpoint repeatedly which increased the expected time. I know that "print("Binary search time: ", (end - start)/length, "seconds")" should likely be around 6 with a list that is much larger (length = 1000) so it surprises me that a length of 100 gets around 3 on most of my tests (and my PC is not that slow).

1

There are 1 best solutions below

0
Ukulele On BEST ANSWER

low, high and midpoint are meant to be indexes - position numbers - in the list, not the actual elements of the list. This is what was causing IndexError as the list elements were too large/small to be interpreted as indexes.

So low = list[0] and high = list[-1] set low and high as list elements; to make them indexes you can set low = 0, high = len(list) - 1.

Also, it might not be a good idea to call your list list as this is the default python name for the list class (although it doesn't cause problems here as it's within the scope of a function)

import random

def binary_search(list, target, low=None, high=None):
    if low is None:
        low = 0 # Sets low to an index
    if high is None:
        high = len(list) - 1 # Sets high to an index
    midpoint = (low+high) // 2

    if high < low:
        return print("Target was not in the list")
    if list[midpoint] == target:
        return midpoint
    elif target < list[midpoint]:
        return binary_search(list, target, low, midpoint-1)
        # new max == midpoint-1 (because it was not the target)
    else:
        return binary_search(list, target, midpoint+1, high)
        # new min == midpoint+1 (because it was not the target)`

if __name__ == '__main__':
    # tests how fast the binary search is by randomly generating a sorted list
    length = 100
    sorted_list = set()
    while len(sorted_list) < length:
        sorted_list.add(random.randint(-3*length, 3*length))
    sorted_list = sorted(list(sorted_list))

    start = time.time()
    for target in sorted_list:
        binary_search(sorted_list, target)
    end = time.time()
    print("Binary search time: ", (end - start)/length, "seconds")

Binary search time: 6.914138793945312e-07 seconds