I have a topic with 6 partitions and 4 consumers consuming from it. The producer writes to the partitions in a round-robin manner. All 4 consumers belong to the same consumer group.
I can see with some load testing that 2 of the partitions are getting consumed very slowly, while the others are almost always empty. I would like to increase my throughput as much as possible.
- What is the default partition assignment strategy in Kafka?
- If the load increases at some point, I would like to scale my consumers up to 6 (the same number as partitions, so each consumer gets exactly one partition). In the 4-consumer scenario, to achieve the best possible throughput, should I limit my producer to only 4 partitions until I have increased the number of consumers?
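To make the 6-partition, 4-consumer scenario concrete, here is a minimal sketch of how a range-style assignment would split the partitions. It is a plain-Python simulation, not Kafka code: `range_assign` and the consumer ids `c1`..`c6` are hypothetical names, and it assumes the range strategy (contiguous blocks, with the first few members taking one extra partition), which may not be what a given client version is configured to use.

```python
def range_assign(num_partitions, consumers):
    """Simulate a range-style split of one topic's partitions.

    Assumption: members are sorted by id and each gets a contiguous block;
    the first (num_partitions % len(consumers)) members get one extra.
    """
    members = sorted(consumers)
    base, extra = divmod(num_partitions, len(members))
    assignment = {}
    start = 0
    for index, member in enumerate(members):
        count = base + (1 if index < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


# Four consumers: two of them end up owning two partitions each.
four = range_assign(6, ["c1", "c2", "c3", "c4"])

# Six consumers: a 1-1 mapping of consumer to partition.
six = range_assign(6, ["c1", "c2", "c3", "c4", "c5", "c6"])

for member, partitions in four.items():
    print(member, partitions)
```

Under these assumptions, with 4 consumers two members each carry double the share of the others, which is worth comparing against the per-partition lag seen in the load test.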
Which Kafka version are you using?
It seems your producer is not partitioning messages efficiently.
You can write a custom partitioner with a good hash function that distributes messages evenly across partitions, giving the consumers a fair chance to consume them in parallel.
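As a rough illustration of the hash-based idea, here is a minimal sketch in plain Python. It is not the Kafka client API: in the Java client a custom partitioner would implement the `Partitioner` interface and be set through the `partitioner.class` producer config. The function name `choose_partition`, the `order-N` keys, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib
from collections import Counter


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition by hashing it.

    Assumption: CRC32 stands in for whatever hash a real partitioner uses.
    """
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 6

# Hypothetical keys; count how many messages land on each partition.
counts = Counter(
    choose_partition(f"order-{i}".encode(), NUM_PARTITIONS)
    for i in range(60000)
)

for partition in range(NUM_PARTITIONS):
    print(partition, counts[partition])
```

Printing the per-partition counts for a sample of your real keys is a quick way to check for skew before deploying anything. Note that the same key always maps to the same partition, so a few very hot keys can still overload one partition however good the hash is.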