This article describes the control queues that the Nsb master node uses to manage message load, but it's still not clear to me how to interpret disproportions in the number of messages in these queues: https://docs.particular.net/nservicebus/msmq/distributor/
I'm observing slowness in my Nsb service, which has never been slow before. For some reason, fewer parallel threads are created per master node than in the past, and there has been no change in the worker or master node configuration, such as the maximum number of threads to allocate. I'm trying to figure out whether the master node is not feeding the workers, or the workers are not taking on more work.
I see that the number of messages in the control queue jumps from 15 to 40, while the storage queue holds only 5-8. Should I interpret that as the workers being ready to work while the distributor can't send them more messages? Thanks
The numbers in the control and storage queue will jump up and down as long as the distributor is handing out messages. A message coming into the control queue will immediately be popped off that queue and onto the storage queue. A message coming into the primary queue of the distributor will immediately result in the first message of the storage queue being popped off. It's hard to interpret the numbers of messages in the queues of a running distributor because, by the time you look at the numbers with Computer Management or Queue Explorer, they will have changed.
The extreme cases are these:
1. No messages in the primary input queue of the distributor and no work happening on any of the workers.
2. All workers are working at full capacity. None able to take on more work.
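To make the queue mechanics and the two extremes concrete, here is a toy simulation. It is only a sketch of the behaviour described above, not NServiceBus code: the class `ToyDistributor`, its method names, and the worker and message names are all made up for illustration.

```python
from collections import deque


class ToyDistributor:
    """Toy model of the three distributor queues (hypothetical, not NServiceBus)."""

    def __init__(self):
        self.control = deque()  # ready messages arriving from workers
        self.storage = deque()  # workers known to have a free thread
        self.primary = deque()  # work waiting to be handed out
        self.sent = []          # (worker, message) pairs already handed out

    def worker_ready(self, worker):
        """A worker thread reports that it can take one more message."""
        self.control.append(worker)
        self._pump()

    def enqueue_work(self, message):
        """A work message arrives in the distributor's primary queue."""
        self.primary.append(message)
        self._pump()

    def _pump(self):
        # Ready messages move straight from the control queue to the storage queue.
        while self.control:
            self.storage.append(self.control.popleft())
        # Each pending work message consumes the first entry of the storage queue.
        while self.primary and self.storage:
            self.sent.append((self.storage.popleft(), self.primary.popleft()))


distributor = ToyDistributor()
for worker in ("worker-a", "worker-a", "worker-b"):
    distributor.worker_ready(worker)
# Extreme 1: no work queued, every free thread is listed in the storage queue.
print(len(distributor.storage), len(distributor.primary))  # 3 0

for i in range(5):
    distributor.enqueue_work(f"msg-{i}")
# Extreme 2: all threads busy, storage queue empty, work backs up in the primary queue.
print(len(distributor.storage), len(distributor.primary))  # 0 2
```

In this model the control queue is always drained immediately, so the storage queue depth and the primary queue backlog are the numbers that carry information.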
In a running system, it can be anything between these two extremes, so, unfortunately, it's hard to say much from just a snapshot of the control and storage queue.
Some troubleshooting tips:
If the storage queue is empty, the distributor can not hand out more work. It does not know where to send it. This happens if all the workers are fully occupied as they will not be sending any ready-messages back to the control queue until they finish up handling a message.
If the storage queue is consistently small compared to the total number of worker threads across all the workers, you are approaching the total maximum capacity of your workers.
I suggest you start looking at the logs of the workers and see if the work they are doing is taking longer than usual. Slower database/third party integration?
Another thing to check is if there has been anything IO-heavy added to the machine hosting the distributor. If the distributor was already running at close to max capacity, adding extra IO might slow down MSMQ on the box, giving you worse throughput.
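Since a single snapshot says little, one option is to note the storage queue depth several times and compare the samples against the total number of worker threads, as in the tips above. The sketch below is only an illustration: the function name, the `low_ratio` threshold of 0.2, and the thread counts in the usage lines are assumptions, not values from NServiceBus.

```python
from statistics import mean


def interpret_storage_samples(samples, total_worker_threads, low_ratio=0.2):
    """Classify repeated storage-queue depth readings.

    low_ratio is an arbitrary cut-off chosen for this sketch; tune it for
    your own system.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if total_worker_threads <= 0:
        raise ValueError("total_worker_threads must be positive")
    if max(samples) == 0:
        # Storage queue always empty: the distributor has nowhere to send work.
        return "saturated"
    if mean(samples) / total_worker_threads < low_ratio:
        # Consistently small compared to the total thread count.
        return "near capacity"
    return "spare capacity"


# Hypothetical readings in the 5-8 range, with made-up thread totals.
print(interpret_storage_samples([5, 8, 6, 7], total_worker_threads=40))  # near capacity
print(interpret_storage_samples([5, 8, 6, 7], total_worker_threads=16))  # spare capacity
```

The readings themselves would still come from wherever you already look at the queues, such as Computer Management or Queue Explorer.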