How to test that a specific sorting algorithm was actually implemented?

92 Views Asked by At

I'm looking for an automated test to validate that merge sort was actually implemented. I know how to validate that a sorting algorithm actually sorts, but how do I validate that merge sort was used instead of, say, bubble sort?

I looked back at my old lectures and they hardcoded a target time of 0.01s for a 20,000 element list. They did not leave any comment of how they came up with 0.01s. So my question is how do I validate that merge sort was implemented?

The following code shows this "ghost" 0.01 that the lecture came up with. Where the did this 0.01 come from?

FYI, BIG_SORT_SIZE = 20,000

 /** @return true if test passes, else false */
 private boolean testTimeToSortBigList() {
    final int bigNum = BIG_SORT_SIZE; //okay, not THAT big
    final double maxTime = 0.02;
    final double targetTime = 0.01;
    try {
        IndexedUnsortedList<Integer> list1 = newList();
        Random rand = new Random(123);
        for (int i = 0; i < bigNum; i++) {
            list1.add(new Integer(rand.nextInt()));
        }

        long startTime = System.nanoTime();
        Sort.sort(list1);
        long endTime = System.nanoTime();
        long totalTime = endTime - startTime;
        double seconds = (double)totalTime/10e9;
        System.out.printf("\nTime to sort %d random integers: %.3f seconds\n", bigNum, seconds);
        System.out.printf("Target time < %.3f seconds. Time > %.3f suggests O(n^2) runtime.\n", targetTime, maxTime);

        return (seconds < maxTime);
    } catch (Exception e) {
        System.out.printf("caught unexpected %s\n", e.toString());
        return false;
    }
}   
3

There are 3 best solutions below

1
Keshav Chaudhary On

You could probably try with worst case examples to test the time differences for various algorithms . For example bubble sort on a very long sequence of decreasing numbers will perform much slower than merge sort for the same sequence. It is not automated but if you want to check the algorithm , this might be a brute force approach to it.

1
trincot On

how do I validate that merge sort was implemented?

If you only have the Sort.sort method and no other method that can execute the sort algorithm step by step, then this is not generally possible.

The following code shows this "ghost" 0.01 that the lecture came up with. Where the did this 0.01 come from?

From a mind that made lots of assumptions about the computer that the program is running on and about the function that is being tested. This is not reliable, because:

  • I could be running the program with mergesort implemented, but on an old computer that runs 100 times slower than modern day computers

  • I could be running the same program on a virtual machine that is low on memory and thus triggers memory swaps to/from disk, slowing down things considerably

  • I could be running the program on the worlds largest supercomputer, but using bubblesort, giving running times for 20,000 values that beat mergesort running on the same inputs on a "normal" PC.

  • The function could implement another sort algorithm that has a similar running time, like some version of quicksort, introsort, tournament sort, heapsort, treesort, ...

  • The mergesort implementation could use an inefficient comparison function, or inefficient swap function, slowing down the running time, but still implementing mergesort correctly. Similarly, it could have additional overhead, like logging actions to disk. But this would not make it not mergesort.

  • The function could itself call the native (highly optimised) sort function (which is something like timsort) and then optionally add some artificial waiting time so to align with the expected running time of the testing code.

Even if you only wanted to know whether the implementation uses a O(log) algorithm, the main issue still is that time complexity cannot be measured by merely executing the implementation! A O(log) algorithm may need more time to sort 20,000 elements than a O(²) algorithm. Time complexity only tells us something about asymptotic growth, not about how measured running times will compare for a chosen input size.

0
btilly On

The best way, if you don't have the code, is to sort objects with a custom comparator. This gives you the exact sequence of comparisons made. Which is very telling about the exact implementation.

But be warned, slight implementation changes will make it look like a different sort algorithm. For example with a merge sort, switching the recursive calls changes most of the comparisons.