I'm working with disk.frame and it's great so far.
One piece that confuses me is the chunk size. My sense is that small chunks might create too many tasks, and disk.frame might eat up time managing those tasks. On the other hand, big chunks might be too expensive for the workers, reducing the performance benefit from parallelism.
What pieces of information can we use to make a better guess for chunk size?
This is a tough problem and I probably need better tools.
Currently, everything is done on a guess basis. But I have made a presentation on this and will try to bring it into the docs soon.
Ideally, you want:
RAM Used = number of workers * RAM usage per chunk
So, if you have 6 workers (ideal for 6 CPU cores), you would want smaller chunks than someone with 4 workers but the same amount of total RAM.
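As a rough back-of-the-envelope sketch (plain R arithmetic, not disk.frame API; all the numbers are made up for illustration), you can invert that formula to get a per-chunk budget and a chunk count:

```r
total_ram_gb   <- 16   # RAM you are willing to hand to the workers
n_workers      <- 6    # e.g. one worker per CPU core
data_size_gb   <- 48   # rough in-memory size of the whole dataset
ram_multiplier <- 2    # fudge factor: many operations need ~2x the chunk's size

# Invert "RAM used = number of workers * RAM usage per chunk"
ram_per_chunk_gb <- total_ram_gb / n_workers           # budget per worker
chunk_size_gb    <- ram_per_chunk_gb / ram_multiplier  # usable chunk size
n_chunks         <- ceiling(data_size_gb / chunk_size_gb)
n_chunks   # ~36 chunks for this example
```

With 4 workers instead of 6 and the same 16 GB, the per-worker budget grows from about 2.7 GB to 4 GB, so you can afford bigger chunks, which is exactly the trade-off above.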
The difficulty is in estimating "RAM usage per chunk", which is different for different operations like merge, sort, and just vanilla filtering!
This is a hard problem to solve in general, so there is no good solution for now.
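In the meantime, one rough empirical workaround (just a sketch, not disk.frame functionality; it assumes the 'bench' package and uses synthetic data as a stand-in for one chunk) is to load a single representative chunk as an ordinary data.frame and measure how much memory each candidate operation allocates on it:

```r
library(bench)

# Synthetic stand-in for one chunk (~1e6 rows)
chunk <- data.frame(x = runif(1e6), g = sample(letters, 1e6, replace = TRUE))

filter_alloc <- bench::mark(chunk[chunk$x > 0.5, ], iterations = 1)$mem_alloc
sort_alloc   <- bench::mark(chunk[order(chunk$x), ], iterations = 1)$mem_alloc

filter_alloc  # roughly the surviving rows plus the logical index
sort_alloc    # the ordering vector plus a full copy of the chunk

# Scale by the number of workers: if 6 workers each sort a chunk at the same
# time, you need roughly 6 * sort_alloc of free RAM, so size chunks so that
# the worst-case operation still fits.
```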