I have an IoT system with around 100k devices, each publishing its state every second to a backend written in Java/Spring Boot. Until now I have been using gRPC, but I'm seeing excessive CPU usage, so I'm planning to have the devices publish to RabbitMQ and let backend workers process the messages.
Processing: Updating the db table.
Since data from the same device must be processed sequentially, I was planning to use RabbitMQ's consistent hashing exchange and bind n queues for n workers. But I'm not sure how that would work with autoscaling.
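To make the idea concrete, here is a minimal plain-Java sketch of how a consistent-hash ring maps a routing key (the device ID) to a queue. It is a toy model, not RabbitMQ's actual implementation: the class name `DeviceRing`, the `worker-N` queue names, the CRC32 hash, and the 100 points per queue are all assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Toy consistent-hash ring: maps a routing key (device ID) to a queue name. */
final class DeviceRing {
    // Virtual points per queue; an assumed value, more points give a smoother spread.
    private static final int POINTS_PER_QUEUE = 100;
    private final TreeMap<Long, String> ring = new TreeMap<>();

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Places the queue's virtual points on the ring (like binding a queue). */
    void addQueue(String queue) {
        for (int i = 0; i < POINTS_PER_QUEUE; i++) {
            ring.put(hash(queue + "#" + i), queue);
        }
    }

    /** Removes every point owned by the queue (like unbinding or deleting it). */
    void removeQueue(String queue) {
        ring.values().removeIf(queue::equals);
    }

    /** Picks the first point at or after the key's hash, wrapping around the ring. */
    String queueFor(String deviceId) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no queues bound");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(deviceId));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        DeviceRing ring = new DeviceRing();
        for (int i = 0; i < 4; i++) {
            ring.addQueue("worker-" + i);
        }
        List<String> before = new ArrayList<>();
        for (int d = 0; d < 10_000; d++) {
            before.add(ring.queueFor("device-" + d));
        }
        ring.addQueue("worker-4"); // simulate scaling out by one worker
        int moved = 0;
        for (int d = 0; d < 10_000; d++) {
            if (!before.get(d).equals(ring.queueFor("device-" + d))) {
                moved++;
            }
        }
        System.out.println("devices remapped after adding a queue: " + moved + " of 10000");
    }
}
```

In this model, adding a queue only moves some devices, and only to the new queue. Routing is decided at publish time, though, so messages already sitting in the old queue for a remapped device stay there, which is exactly where the ordering concern during scaling comes from.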
I thought of creating an auto-delete queue for each backend instance and binding it to the exchange, but I couldn't figure out the following:
- How to rebalance messages already sitting in the queue?
- If a connectivity issue occurs, the queue might get deleted, so I'd need to re-forward its messages to the remaining queues.
- Are there any algorithms for handling the autoscaling of workers? For instance, if messages pile up, I need to spawn new workers even though CPU/memory usage is low.
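On the last point, one common approach is to scale on queue backlog instead of CPU/memory. Below is a rough sketch of such a rule, not a standard algorithm from any library: the class `BacklogScaler`, its parameters, and the example numbers (5,000 msg/s per worker, a 60-second drain target, 2 to 50 workers) are assumptions. Only the 100,000 msg/s publish rate follows from 100k devices publishing once per second.

```java
/** Sketch of a backlog-based scaling rule; all names and numbers are illustrative. */
final class BacklogScaler {

    /**
     * Returns the worker count needed to keep up with the incoming rate
     * and drain the current backlog within drainSeconds, clamped to [minWorkers, maxWorkers].
     */
    static int desiredWorkers(long backlog, double publishRate, double perWorkerRate,
                              double drainSeconds, int minWorkers, int maxWorkers) {
        if (perWorkerRate <= 0 || drainSeconds <= 0) {
            throw new IllegalArgumentException("rates and drain time must be positive");
        }
        // Throughput needed: steady incoming traffic plus the share of backlog to clear per second.
        double requiredRate = publishRate + backlog / drainSeconds;
        int needed = (int) Math.ceil(requiredRate / perWorkerRate);
        return Math.max(minWorkers, Math.min(maxWorkers, needed));
    }

    public static void main(String[] args) {
        // 600k queued messages, 100k msg/s incoming, assumed 5k msg/s per worker, drain in 60 s.
        int workers = desiredWorkers(600_000, 100_000, 5_000, 60, 2, 50);
        System.out.println("desired workers: " + workers); // prints "desired workers: 22"
    }
}
```

The backlog and publish rate would have to come from the broker's own metrics, and with hash-based routing each added worker still needs a queue of its own, so the scaling decision and the rebalancing question above are tied together.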
I think I'll go with MQTT's shared subscriptions for this case.
https://emqx.medium.com/introduction-to-mqtt-5-0-protocol-shared-subscription-4c23e7e0e3c1
The hash strategy seems like what I'm looking for.