I am working on a problem that involves a producer-consumer pattern. I have one producer that produces tasks and 'n' consumers that consume them. A consumer's task is to read some data from a file and then upload that data to S3. One consumer can read up to x MB (8/16/32) of data and then uploads it to S3. Keeping all the data in memory caused higher memory consumption than expected from the program, so I switched to reading the data from the file, writing it to a temporary file, and then uploading that file to S3. This performed better in terms of memory, but CPU took a hit.

I wonder if there is any way to allocate a fixed amount of memory once and then share it among different goroutines. What I would want is this: if I have 4 goroutines, I can allocate 4 different arrays of x MB and then reuse the same array on every goroutine invocation, so that a goroutine doesn't allocate memory every time and also doesn't depend on the GC to free it.
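What I am imagining is roughly the sketch below. Task, readInto, and upload here are placeholders for my real task type, file read, and S3 upload, not actual code:

    type Task struct{ offset, size int64 } // placeholder task description

    func readInto(t Task, buf []byte) int { return 0 } // placeholder: would read t.size bytes at t.offset into buf and return the count
    func upload(p []byte)                 {}           // placeholder for the real S3 upload

    const xMB = 16 << 20 // 16 MB, for example

    // worker reuses one pre-allocated buffer for every task it consumes.
    func worker(tasks <-chan Task, buf []byte, wg *sync.WaitGroup) {
        defer wg.Done()
        for t := range tasks {
            n := readInto(t, buf) // fill the reused buffer; no per-task allocation
            upload(buf[:n])
        }
    }

    func run(tasks <-chan Task) {
        var wg sync.WaitGroup
        for i := 0; i < 4; i++ {
            wg.Add(1)
            go worker(tasks, make([]byte, xMB), &wg) // each buffer is allocated exactly once
        }
        wg.Wait()
    }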
Edit: Adding the crux of my code. My Go consumer looks like this:
    type Block struct {
        offset int64
        size   int64
    }

    func consumer(blocks []Block) {
        var dataArr []byte
        for _, block := range blocks {
            data := file.Read(block.offset, block.size) // pseudocode: read block.size bytes at block.offset
            dataArr = append(dataArr, data...)
        }
        upload(dataArr)
    }
I read the data from the file based on Blocks; a block can contain several small chunks or one big chunk, limited to x MB in total.
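Note that the append in that loop grows dataArr and re-allocates as it goes. What I would like instead is to read each block straight into one pre-sized, reused buffer, roughly like this (a sketch; it assumes an *os.File so I can use ReadAt, that the blocks for one call fit in buf, and upload again stands in for the S3 call):

    func consumer(f *os.File, blocks []Block, buf []byte) error {
        n := 0
        for _, block := range blocks {
            // read block.size bytes at block.offset straight into the reused buffer
            if _, err := f.ReadAt(buf[n:n+int(block.size)], block.offset); err != nil {
                return err
            }
            n += int(block.size)
        }
        upload(buf[:n])
        return nil
    }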
Edit 2: I tried sync.Pool based on the suggestions in the comments, but I did not see any improvement in memory consumption. Am I doing something wrong?
    var pool *sync.Pool

    func main() {
        pool = &sync.Pool{
            New: func() interface{} {
                return make([]byte, 16777216) // 16 MB buffer
            },
        }
        for i := 0; i < 4; i++ {
            // blocks is a 2-D slice; each index contains a slice of blocks.
            go consumer(blocks[i]) // (waiting for the goroutines omitted here)
        }
    }
    func consumer(blocks []Block) {
        d := pool.Get().([]byte)
        n := 0
        for _, block := range blocks {
            // pseudocode: read block.size bytes at block.offset into the pooled buffer
            file.Read(block.offset, block.size, d[n:n+int(block.size)])
            n += int(block.size)
        }
        upload(d[:n])
        pool.Put(d)
    }
Take a look at SA6002 of Staticcheck, about sync.Pool. You can also use the pprof tool to see where the allocations actually come from.
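In short, SA6002 points out that putting a non-pointer value such as a []byte into a sync.Pool converts it to interface{}, which itself allocates on every Put. The usual workaround (a sketch against your example, not a drop-in fix) is to store a pointer to the slice:

    var pool = &sync.Pool{
        New: func() interface{} {
            b := make([]byte, 16777216)
            return &b // returning *[]byte avoids the allocation caused by converting a slice to interface{}
        },
    }

    func consumer(blocks []Block) {
        bp := pool.Get().(*[]byte)
        d := *bp
        // ... read the blocks into d and upload d, as in your code above ...
        _ = d
        pool.Put(bp)
    }

Also note that with 4 long-lived consumers each holding one 16 MB buffer for their whole run, the peak stays at 4 buffers whether or not a pool is used; a pool mainly helps when buffers are acquired and released repeatedly. To confirm where the memory goes, you can collect a heap profile (for example with runtime/pprof or go test -memprofile) and inspect it with go tool pprof.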