Parameter configuration: workerNum=64; subExecutors=32; maxMemSize=256; localExecutors=63.
In single-process deployment:
The subscribeTable function specifies the filter parameter as hash filtering with 32 buckets. The DolphinDB license allows 64 cores.
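For context, the setup described above looks roughly like the sketch below. The table name "trades", the filter column "sym", and the handler table "resultTable" are hypothetical; per the DolphinDB manual, hash filtering is expressed by passing a tuple to the filter parameter whose first element is the bucket count and whose second element is the bucket index.

```
// Hypothetical sketch of a 32-bucket hash-filtered subscription.
// The filter column must be declared before filtered subscriptions
// can attach to the stream table.
setStreamTableFilterColumn(trades, `sym)

// One subscription per bucket: filter=(32, i) requests hash
// filtering with 32 buckets and delivers only bucket i.
for (i in 0..31) {
    subscribeTable(tableName="trades", actionName="calc"+string(i),
        offset=-1, handler=tableInsert{resultTable}, msgAsTable=true,
        filter=(32, i))
}
```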
According to the table pubTables returned by getStreamingStat(), msgOffset grows much more slowly with 32 buckets than with 8 buckets. The table subWorkers shows that data distribution is entirely outpaced by data processing.
On my system, CPU usage briefly spikes to a multi-threaded peak, then the system waits until all subworkers have finished processing (with CPU usage dropping below 1%) before proceeding to the next round of data distribution.
Moreover, the table subWorkers returned by getStreamingStat() shows that the queueDepth of most workers is usually 0. The slow data distribution makes post-trade processing severely inefficient.
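Both observations can be reproduced directly: getStreamingStat() returns a dictionary of tables, so the publisher-side and subscriber-side views quoted above can be pulled out as follows (a sketch; column names such as msgOffset and queueDepth are the ones cited in the question).

```
// Publisher side: msgOffset indicates how far each subscriber has
// advanced through the publish queue. A slowly growing msgOffset
// means distribution, not computation, is the bottleneck.
pubStat = getStreamingStat().pubTables
select tableName, msgOffset from pubStat

// Subscriber side: a queueDepth that is almost always 0 means the
// workers drain their queues faster than the publisher fills them.
subStat = getStreamingStat().subWorkers
select topic, queueDepth from subStat
```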
My questions are:
What is the reason for the slow data distribution? Is it because the stream engine's computing tasks are too simple, so the workers finish faster than messages can be distributed to them?
How can the data distribution speed be boosted?
This issue is fixed in server 1.30.23 / 2.00.11. For earlier versions, there are two methods:
1. Add a layer of subscription. Use the subscribeTable function to subscribe to the original stream table and write the messages into an intermediate stream table. Then, subscribe to the intermediate table with hash filtering.
2. Configure the parameter localSubscriberNum to increase the number of threads used to distribute messages from the publish queue.

You can also reduce the replay speed to ease the pressure on the publish queue.
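The first method can be sketched as follows. All names (the original table "trades", the intermediate table "midTable", the result table "resultTable") are hypothetical: a single unfiltered subscription relays every message into a shared intermediate stream table, and the 32 hash-filtered subscriptions attach to that table instead of the original one.

```
// Intermediate stream table with the same schema as the original.
share streamTable(100000:0, `sym`price`qty, [SYMBOL, DOUBLE, INT]) as midTable
setStreamTableFilterColumn(midTable, `sym)

// Layer 1: one unfiltered subscription relays all messages.
subscribeTable(tableName="trades", actionName="relay", offset=-1,
    handler=tableInsert{midTable}, msgAsTable=true)

// Layer 2: the hash-filtered subscriptions now read from midTable,
// so the per-bucket filtering load is moved off the original
// publisher's distribution thread.
for (i in 0..31) {
    subscribeTable(tableName="midTable", actionName="calc"+string(i),
        offset=-1, handler=tableInsert{resultTable}, msgAsTable=true,
        filter=(32, i))
}
```

The second method is a configuration change rather than script: set localSubscriberNum in the node's configuration file (for example, localSubscriberNum=4; the exact value to use depends on your workload) so that multiple threads distribute messages from the publish queue to local subscribers.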