Why is this concurrent code slower than the serial code? (Go)

65 Views Asked by At

I wanted to try using a concurrent solution to problem 206 (Reverse a Linked List) on LeetCode, so I wrote this:

func reverseList(head *ListNode) *ListNode {
    var prev, temp *ListNode
    cur, ch := head, make(chan bool)

    for cur != nil {
        temp = cur.Next
        go func(cur *ListNode, prev *ListNode) {
            cur.Next = prev
            ch <- true
        }(cur, prev)
        <-ch
        cur, prev = temp, cur
    }

    return prev
}

my previous, serial solution was this:

func reverseList(head *ListNode) *ListNode {
    var cur, prev *ListNode = head, nil
    for cur != nil {
        prev, cur, cur.Next = cur, cur.Next, prev
    }

    return prev
}

Here are the benched functions:

func BenchmarkReverseLinkedListConcurrent(b *testing.B) {
    slice := generateLinkedListSlice(1000000, b.N) //generates slice of Linked Lists

    b.ResetTimer()
    var cur, prev, temp *ListNode
    var ch chan bool
    for i := 0; i < b.N; i++ {
        fmt.Println("test")
        cur, ch = slice[i], make(chan bool, 5000)

        for cur != nil {
            temp = cur.Next

            go func(cur *ListNode, prev *ListNode) {
                cur.Next = prev
             ch <- true
            }(cur, prev)

            <-ch

            cur, prev = temp, cur
        }
    }
}

func BenchmarkReverseLinkedListSerial(b *testing.B) {
    slice := generateLinkedListSlice(1000000, b.N) //generates slice of Linked Lists

    b.ResetTimer()
    var cur, prev *ListNode
    for i := 0; i < b.N; i++ {
        fmt.Println("test")
        cur = slice[i]
        for cur != nil {
            prev, cur, cur.Next = cur, cur.Next, prev
        }
    }
}

I benched both of them and got these results:

goos: windows
goarch: amd64
pkg: benchmark-tests
cpu: AMD Ryzen 5 4500U with Radeon Graphics
BenchmarkReverseLinkedListConcurrent-6               602           1812494 ns/op
BenchmarkReverseLinkedListConcurrent-6               673           1767912 ns/op
BenchmarkReverseLinkedListConcurrent-6               680           1557250 ns/op
BenchmarkReverseLinkedListConcurrent-6               771           1896961 ns/op
BenchmarkReverseLinkedListConcurrent-6               634           1874298 ns/op
BenchmarkReverseLinkedListSerial-6                332670             33707 ns/op
BenchmarkReverseLinkedListSerial-6                308980             34237 ns/op
BenchmarkReverseLinkedListSerial-6                278793              3735 ns/op
BenchmarkReverseLinkedListSerial-6                326935              3989 ns/op
BenchmarkReverseLinkedListSerial-6                279200              4182 ns/op

Is there something wrong with my concurrent solution, or is it some other problem, like an overhead?

1

There are 1 best solutions below

5
TehSphinX On

There is some overhead to:

  • starting a goroutine
  • sending a value over a channel

That overhead must be justified by doing "enough" work in parallel. Multiple CPUs can cut down the time it takes to do a task if the task can be split well. But the reduction of time it takes to do the task must outweigh the communication (channel) and setup (go routines) overhead.

In your example, you are not doing significant work in parallel. All the goroutine does is cur.Next = prev. If instead you had cur.Next = someHeavyOperation(prev), things would be different.

Next issue with the example is that the main routine does nothing but wait for the goroutine to return. Effectively that only puts the work on another routine (with all the overhead mentioned above) but then doesn't utilize the CPU of the main routine. Meaning there is no parallel usage of the 2 goroutines. Only one routine is working at a time.