sync: negative WaitGroup counter

143 Views Asked by At

Who can tell me why the panic occurs. The background of this demo is to verify the behavior of slice cocurrent write, but when I run the demo, panic occurs occasionally (not always). So Why?

The code:

func main() {
    for i := 0; i < 10000; i++ {
                // when capacity > 2, panic disappear
        b := make([]int, 2, 2) 
        b[0], b[1] = 0, 1

        wg := sync.WaitGroup{}
        wg.Add(1)
        go func() {
            b = append(b, 3)
            wg.Done()
        }()
        wg.Add(1)
        go func() {
            b = append(b, 4)
            wg.Done()
        }()
        wg.Wait()
    }

}

Errors:

panic: sync: negative WaitGroup counter

goroutine 12821 [running]:
sync.(*WaitGroup).Add(0x0?, 0xc0000f4ba0?)
        /usr/local/go/src/sync/waitgroup.go:83 +0xda
sync.(*WaitGroup).Done(...)
        /usr/local/go/src/sync/waitgroup.go:108
main.main.func1()
        /Users/liushi/Projects/go/golang-demo/main.go:54 +0x97
created by main.main
        /Users/liushi/Projects/go/golang-demo/main.go:48 +0x11f

Process finished with the exit code 2

When the capacity of b more than 2, panic disappears.

2

There are 2 best solutions below

3
Maria Ines Parnisari On

As an FYI: b is a slice. Slices have fixed size arrays backing them. In your case, b has two elements and it's already full. So when you try to append, Go has to do a bit more work before appending, it has to copy all elements into a new backing array. If you change the initial capacity to 3, the first time you call append will execute quickly.

The problem with your code is that you have a data race on the b variable. This is easy to verify if you run with go test -race . To fix this, you could use a mutex to control concurrent access to b.

Now, I don't know exactly why it's panicking on wg.Add(1), but since you currently are corrupting memory, first fix that and then see if you get panics. I think you shouldn't get any because the WaitGroup portion looks fine to me.

0
Ishan Goel On

Maria's explanation that there's a data race for b is correct. Here's the right way to implement this with a mutex:

package main

import "sync"

func main() {
  mut := sync.Mutex{}
  for i := 0; i < 10000; i++ {
    // when capacity > 2, panic disappear
    b := make([]int, 2, 2)
    b[0], b[1] = 0, 1

    wg := sync.WaitGroup{}
    wg.Add(1)
    go func() {
      mut.Lock()
      b = append(b, 3)
      mut.Unlock()
      wg.Done()
    }()
    wg.Add(1)
    go func() {
      mut.Lock()
      b = append(b, 4)
      mut.Unlock()
      wg.Done()
    }()
    wg.Wait()
  }

}

Another clue that this may be due to memory corruption is the fact that when the capacity is increased the panic disappears. Currently, with capacity 2, the Go runtime is forced to double the underlying array's size before it appends the value 3, and the data race could mean that this increase interferes with where memory for wg is stored.