Event sourcing usually implies that each row in the event store refers to a single aggregate ID:
| event_id | event_type | entity_type | entity_id | event_data |
|---|---|---|---|---|
| 102 | OrderCreated | Order | 101 | {...} |
| 103 | OrderUpdated | Order | 101 | {...} |
But what if a batch action results in a large number of events, such as:
- "Mark 10 000 e-mails as read" => 10 000 EmailReadEvent
- "Update the status of 10 000 devices" => 10 000 DeviceStatusUpdatedEvent
For such scenarios (with replicas in other microservices):
- Should we store those 10 000 events in the event store?
- Should we publish 10 000 events to the broker, and consume each event one by one on the subscriber side?
This seems wasteful and very inefficient. Unfortunately, I cannot find any resource on the web about this specific topic. My idea for now is to design all my domain events to carry a list of IDs for batch actions: EmailReadEvent would have a list of email IDs, and DeviceStatusUpdatedEvent would have a list of device IDs. The same goes for the event store, where the entity_id column would hold a list of IDs.
What do you think about this approach? Is there a better way to do this?
The limitation of one aggregate ID per event doesn't really come from event sourcing: it comes from the fact that these are events for an aggregate, which implies that every event concerning a given email/device/etc. was emitted with the knowledge of all previous events concerning that email/device/etc.
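To make "emitted with the knowledge of all previous events" concrete, here is a minimal sketch of an append guarded by an expected version. Everything in it (`InMemoryEventStore`, `ConcurrencyError`, the `expected_version` parameter) is a hypothetical name chosen for illustration, not the API of any particular event store.

```python
from dataclasses import dataclass, field


class ConcurrencyError(Exception):
    """Raised when a writer decided on a stale view of the stream."""


@dataclass
class InMemoryEventStore:
    # Hypothetical in-memory store: aggregate ID -> ordered list of events.
    streams: dict[str, list[dict]] = field(default_factory=dict)

    def load(self, aggregate_id: str) -> list[dict]:
        # Return a copy so callers cannot mutate the stored stream.
        return list(self.streams.get(aggregate_id, []))

    def append(self, aggregate_id: str, expected_version: int, events: list[dict]) -> int:
        stream = self.streams.setdefault(aggregate_id, [])
        # The writer must have seen every event already in the stream;
        # otherwise its decision was based on outdated state.
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {len(stream)}"
            )
        stream.extend(events)
        return len(stream)  # the new version of the stream
```

A check like this is per stream, which is why each event in such a store is tied to exactly one aggregate ID.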
You can event source without having aggregates (just like you can have aggregates without event sourcing).
It's also possible to have events that end up associated with multiple devices/emails and have the consistency benefits of an aggregate: all of those emails/devices can "just be" entities in the same aggregate. In a lot of contexts, this is actually viable (e.g. an email inbox or all the emails received in a day in a given inbox could well be a cromulent aggregate; devices that needed a status update at the same time might well belong to the same customer or be deployed to the same site).
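As a rough sketch of that idea, assuming the inbox is the aggregate and emails are entities inside it, a batch action can produce a single event in the inbox's stream. The names `Inbox` and `EmailsMarkedAsRead` are made up for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EmailsMarkedAsRead:
    # One event, one aggregate ID (the inbox), many affected entities.
    inbox_id: str
    email_ids: tuple[str, ...]


@dataclass
class Inbox:
    inbox_id: str
    unread: set[str] = field(default_factory=set)
    version: int = 0

    def mark_as_read(self, email_ids):
        # Only emails that are currently unread actually change state.
        changed = tuple(sorted(set(email_ids) & self.unread))
        if not changed:
            return []  # nothing happened, so no event is emitted
        return [EmailsMarkedAsRead(self.inbox_id, changed)]

    def apply(self, event: EmailsMarkedAsRead) -> None:
        # Fold the event into the aggregate's state.
        self.unread -= set(event.email_ids)
        self.version += 1
```

With this shape, "mark 10 000 e-mails as read" is one row in the event store and one message on the broker, and the `entity_id` column still holds a single ID. Whether an inbox is a sensible consistency boundary depends on the domain.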
As a couple of side notes...
On "replicas in other microservices": this is, to my mind, a potentially strong sign that something has gone wrong, depending on what you mean by "replica" and "other microservice". I would be wary of the possible implication that you're combining ideas that belong in different bounded contexts, which would mean that you could be making a lot of things more complex than they need to be. Alternatively, this could be a consequence of technical constraints forcing more microservices than would be needed to manage complexity.
On consuming events "one by one": there's not necessarily a reason any particular subscriber has to do that (some technical choices could force this on you, but you can likely revisit those choices).
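For instance, a subscriber could pull events in chunks and apply each chunk with a single bulk write. This is only a sketch under assumptions: the event shape (`type`, `email_id`) and the helper names are invented here, and the `set.update` call stands in for whatever bulk update your read model supports.

```python
from itertools import islice
from typing import Iterable, Iterator


def in_batches(events: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive chunks of at most `size` events."""
    iterator = iter(events)
    while batch := list(islice(iterator, size)):
        yield batch


def project_read_emails(events: Iterable[dict], read_ids: set[str], batch_size: int = 500) -> int:
    """Apply EmailRead events to a read model; return the number of bulk writes."""
    writes = 0
    for batch in in_batches(events, batch_size):
        ids = {e["email_id"] for e in batch if e["type"] == "EmailRead"}
        if ids:
            # Stands in for one bulk statement, e.g. UPDATE ... WHERE id IN (...).
            read_ids.update(ids)
            writes += 1
    return writes
```

Here the store and the broker still carry one event per email, but the subscriber pays for one write per chunk rather than one per event.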