I have a simple benchmark that compares the cost of building a slice of structs with building a slice of pointers to those structs:
package pointer

import (
	"testing"
)

type smallStruct struct {
	ID int
}

func newSmallStruct(id int) *smallStruct {
	return &smallStruct{ID: id}
}

func BenchmarkSmallStructPointer(b *testing.B) {
	for n := 0; n < b.N; n++ {
		var slice = make([]*smallStruct, 0, 10000)
		for i := 0; i < 10000; i++ {
			t := newSmallStruct(n + i)
			slice = append(slice, t)
		}
	}
}

func BenchmarkSmallStruct(b *testing.B) {
	for n := 0; n < b.N; n++ {
		var slice = make([]smallStruct, 0, 10000)
		for i := 0; i < 10000; i++ {
			t := newSmallStruct(n + i)
			slice = append(slice, *t)
		}
	}
}
Benchmark results:
go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: test-project/pointer
cpu: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
BenchmarkSmallStructPointer-4 3121 328864 ns/op 161921 B/op 10001 allocs/op
BenchmarkSmallStruct-4 29218 48021 ns/op 81920 B/op 1 allocs/op
Could you explain which operation produces so many allocations for the slice of pointers? It seems to be the append to the slice, but I don't understand why.
The Go compiler performs "escape analysis" to decide which values must be allocated on the heap, and it can log those decisions along with the reasons for them.

The logging is enabled with a compiler flag that takes an optional verbosity level: -gcflags='-m=2'. You can set it on go test, go build, or go run (anything that passes the flag along to go tool compile).

Using this we can see that &smallStruct escapes to the heap, forcing an allocation. If I'm interpreting the results correctly, it's because, even though newSmallStruct is inlined in both test cases, the compiler can see that the pointer is immediately dereferenced in the second case, so the struct can stay on the stack. In the first case the pointer is kept: it is passed on to append and stored in the slice.

Raising the verbosity to 3 (-gcflags='-m=3') prints a more detailed trace of the same decision.
This talk goes over various situations in detail: https://www.youtube.com/watch?v=ZMZpH4yT7M0
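If reading the compiler's log is inconvenient, the difference can also be observed at run time. The following is a minimal sketch, not part of the original benchmark: it counts allocations with testing.AllocsPerRun from the standard library. The helper names fillPointers and fillValues and the two package-level sink variables are assumptions made for this illustration; the sinks only keep the slices reachable so the work is not optimized away.

```go
package main

import (
	"fmt"
	"testing"
)

type smallStruct struct {
	ID int
}

func newSmallStruct(id int) *smallStruct {
	return &smallStruct{ID: id}
}

// Hypothetical sinks: storing the result in a package-level variable
// keeps each slice reachable after the function returns.
var (
	ptrSink []*smallStruct
	valSink []smallStruct
)

// fillPointers stores every pointer in the slice, so each struct
// has to outlive the call that created it.
func fillPointers(n int) {
	s := make([]*smallStruct, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, newSmallStruct(i))
	}
	ptrSink = s
}

// fillValues dereferences the pointer immediately and stores a copy,
// so only the slice's backing array needs to be allocated.
func fillValues(n int) {
	s := make([]smallStruct, 0, n)
	for i := 0; i < n; i++ {
		s = append(s, *newSmallStruct(i))
	}
	valSink = s
}

func main() {
	const n = 10000
	// AllocsPerRun reports the average number of allocations per call.
	fmt.Println("pointers:", testing.AllocsPerRun(100, func() { fillPointers(n) }))
	fmt.Println("values:  ", testing.AllocsPerRun(100, func() { fillValues(n) }))
}
```

With a default build, where newSmallStruct is inlined, the pointer variant should report on the order of n+1 allocations and the value variant far fewer, in line with the allocs/op column in the question. Building the same file with go build -gcflags='-m=2' shows the corresponding escape decisions.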
Optimizing around this
Even without append, using plain index assignment, this will cause a heap allocation.

This is because if &smallStruct{...} were allocated in the stack frame of newSmallStruct(), that address could be overwritten later: Go reuses the stack space of returned functions for new calls. Allocating on the heap here is the "verifiably correct" thing to do.

If you have a case where you need to optimize this away for performance, create the value-only slice first, ideally returning a value from newSmallStruct() so that it works even when the call is not inlined. Then make the pointer slice reference the elements of the value slice.
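The approach described above could look roughly like this. It is a sketch under assumptions: newSmallStructValue is a hypothetical value-returning variant of newSmallStruct, and buildPointers is a name invented for the example.

```go
package main

import "fmt"

type smallStruct struct {
	ID int
}

// newSmallStructValue is a hypothetical variant of newSmallStruct that
// returns a value, so nothing escapes even if the call is not inlined.
func newSmallStructValue(id int) smallStruct {
	return smallStruct{ID: id}
}

// buildPointers allocates all structs in one backing array and then
// points each slice element at its slot in that array.
func buildPointers(n int) []*smallStruct {
	values := make([]smallStruct, n) // one allocation for all structs
	ptrs := make([]*smallStruct, n)  // one allocation for all pointers
	for i := range values {
		values[i] = newSmallStructValue(i)
		ptrs[i] = &values[i]
	}
	return ptrs
}

func main() {
	ptrs := buildPointers(5)
	ptrs[2].ID = 42 // writes through the pointer into the shared backing array
	fmt.Println(len(ptrs), ptrs[0].ID, ptrs[2].ID, ptrs[4].ID) // prints "5 0 42 4"
}
```

This replaces one allocation per element with two allocations in total. The trade-off is that every pointer keeps the whole backing array alive, and the value slice must not be grown with append after the addresses are taken, because a reallocation would leave the pointers referring to the old array.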