golang strange allocations in slice of pointers

116 Views Asked by At

I have simple benchmark to compare performance for creating slice of structs and slice of pointers to that structs

package pointer

import (
    "testing"
)

type smallStruct struct {
    ID int
}

func newSmallStruct(id int) *smallStruct {
    return &smallStruct{ID: id}
}

func BenchmarkSmallStructPointer(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var slice = make([]*smallStruct, 0, 10000)
        for i := 0; i < 10000; i++ {
            t := newSmallStruct(n + i)
            slice = append(slice, t)
        }
    }
}

func BenchmarkSmallStruct(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var slice = make([]smallStruct, 0, 10000)
        for i := 0; i < 10000; i++ {
            t := newSmallStruct(n + i)
            slice = append(slice, *t)
        }
    }
}

Result of benchmark

go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: test-project/pointer
cpu: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
BenchmarkSmallStructPointer-4               3121            328864 ns/op          161921 B/op      10001 allocs/op
BenchmarkSmallStruct-4                     29218             48021 ns/op           81920 B/op          1 allocs/op

Please, explain me, what operation produced so many allocations for slice of pointers? It seems to be append to slice, but I don't understand why?

1

There are 1 best solutions below

0
coxley On

Go has a feature called "escape analysis" that logs what values escape to the heap — and why.

It's enabled with a compiler flag with optional verbosity: -gcflags='-m=2'. You can set it on go test, go build, go run. (Anything that passes that flag along to go tool compile)

Using this we can see that &smallStruct escapes to the heap, forcing an allocation. If I'm interpreting the results correctly, it's because even though newSmallStruct is inlined in both testcases, the compiler is able to see that the pointer is immediately dereferenced in the second case. Whereas the first case continues to pass the pointer to other functions (append):

> go test -bench=. -benchmem -gcflags='-m=2' .
# ...

./main_test.go:12:9: &smallStruct{...} escapes to heap:
./main_test.go:12:9:   flow: ~r0 = &{storage for &smallStruct{...}}:
./main_test.go:12:9:     from &smallStruct{...} (spill) at ./main_test.go:12:9
./main_test.go:12:9:     from return &smallStruct{...} (return) at ./main_test.go:12:2

# ...
./main_test.go:29:23: &smallStruct{...} does not escape

With higher verbosity (3):

./main_test.go:19:23: &smallStruct{...} escapes to heap:
./main_test.go:19:23:   flow: ~R0 = &{storage for &smallStruct{...}}:
./main_test.go:19:23:     from &smallStruct{...} (spill) at ./main_test.go:19:23
./main_test.go:19:23:     from ~R0 = &smallStruct{...} (assign-pair) at ./main_test.go:19:23
./main_test.go:19:23:   flow: t = ~R0:
./main_test.go:19:23:     from t := ~R0 (assign) at ./main_test.go:19:6
./main_test.go:19:23:   flow: {heap} = t:
./main_test.go:19:23:     from append(slice, t) (call parameter) at    <--
./main_test.go:20:18

This talk goes over various situations in detail: https://www.youtube.com/watch?v=ZMZpH4yT7M0

Optimizing around this

Even without append — using plain index-assignment - this will cause a heap allocation.

This is because if &smallStruct{...} was allocated in the stack frame of newSmallStruct(), that address could be overwritten later. Go re-uses old function stack space for new ones. Allocating in the heap here is the "verifiably correct" thing to do.

If you have a case where you need to optimize this away for performance, create the value-only slice first — ideally return a value from newSmallStruct() so it works when not inlined. Then make the pointer-slice reference those local values.

func BenchmarkSmallStructPointer(b *testing.B) {
    for n := 0; n < b.N; n++ {
        vslice := make([]smallStruct, 0, 10000)
        for i := 0; i < 10000; i++ {
            t := newSmallStruct(n + i)
            vslice = append(vslice, *t)
        }

        slice := make([]*smallStruct, len(vslice))
        for i := 0; i < len(vslice); i++ {
            slice[i] = &vslice[i]
        }
    }
}
BenchmarkSmallStructPointer-8              45446             25011 ns/op          163840 B/op          2 allocs/op