Multi-Threaded Merge Sort slower than the iterative approach C++

104 Views Asked by At
#include <iostream>
#include <vector>
#include <thread>

void mergeSort(std::vector<int>& v, int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;

        std::thread t1(mergeSort, std::ref(v), left, mid);
        std::thread t2(mergeSort, std::ref(v), mid + 1, right);

        t1.join();
        t2.join();

        merge(v, left, mid, right);
    }
}

I tried a multi threading approach for the merge sort algorithm, but when testing the difference between multi threaded and iterative even with inputs as large as 1M the iterative approach was way faster. What is wrong with my algorithm and how can I make it work as expected?

2

There are 2 best solutions below

2
Oliver Hawker On

With threads, it's important to consider the following:

  • What is the maximum number of threads can I actually run at once (given my hardware?). Often the number of logical cores on a system.
  • How can I reduce the frequency of threads being started/joined? Starting a thread is an expensive operation, and the work a thread does needs to outweigh the cost of starting it.
  • How (although this is less important for your particular use case) can I reduce the amount that threads access the same data as much as possible?

In your case, I think you're ending up spawning far, far more threads than your system can realistically handle. Let's take the simplest example and say you're on a dual-core machine; you would want to split your work across two threads for maximum throughput.

You might want something like:

void mergeSort(std::vector<int>& v, int left, int right, bool shouldThread) {
    if (left < right) {
        int mid = left + (right - left) / 2;
        if (shouldThread) {
            std::thread t1(mergeSort, std::ref(v), left, mid, false);
            std::thread t2(mergeSort, std::ref(v), mid + 1, right, false);

            t1.join();
            t2.join();
        }
        else
        {
            mergeSort(std::ref(v), left, mid, false);
            mergeSort(std::ref(v), mid + 1, right, false);
        }

        merge(v, left, mid, right);
    }
}

This would start the first level across two threads, and each would recurse down half of the tree.

Assuming this works (untested), your next step would be to make it start threads as far as you have useful hardware to execute them. But I'd check that you understand this level first.

0
rcgldr On

Limit the number of threads to the number of cores or 2x the number of cores. Link to an example 4 thread merge sort, which is 3 times faster than a single thread merge sort.

https://codereview.stackexchange.com/q/148025/59065