I am building an application that follows a clean architecture approach, which tries to ensure the domain layer is independent of any infrastructure layer. But I'm looking at a lot of examples that implement event-sourced aggregates, and weirdly enough, almost all of them have a Version attribute on the base aggregate class. Something like this:
abstract class Aggregate {
  public id: string;
  public version: number;
  // the rest
}
As far as I can tell, this version attribute holds no value for the domain logic itself. The only reason it exists is to support event sourcing and transactional concurrency. To me this seems like a leak, but I don't know what the alternative is.
So my question is, firstly, to verify that this is in fact a leak, and secondly, to ask how to address it.
Yes, it is.
Most people just ignore it. Really.
What you are seeing here is a side effect of the fact that most people expect to use optimistic concurrency when writing to an event stream.
For that to work, you need some information taken from the database read to be available to you when you perform the database write - in other words, something analogous to "compare-and-swap". (Ever had a git repository reject your push because somebody else pushed first? Same idea.)
So that metadata, whatever form it takes, needs to be available at the point in your execution where you are creating the message to save your changes into the event store.
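To make that concrete, here is a minimal sketch in TypeScript, assuming a hypothetical EventStore interface (real event stores expose something similar, though the exact signatures vary):

interface EventStore {
  // returns the history plus the stream version we read at
  read(streamId: string): Promise<{ events: object[]; version: number }>;
  // rejects if the stream has grown past expectedVersion
  append(streamId: string, events: object[], expectedVersion: number): Promise<void>;
}

// stand-in for the actual domain logic
declare function decide(history: object[]): object[];

async function handle(store: EventStore, streamId: string): Promise<void> {
  // 1. Read: gives us both the history and the stream version.
  const { events, version } = await store.read(streamId);

  // 2. Rebuild state and run the domain logic (elided).
  const newEvents = decide(events);

  // 3. Write: the version captured at read time has to survive until
  // this point - this is exactly the metadata the Version attribute
  // is carrying around.
  await store.append(streamId, newEvents, version); // compare-and-swap
}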
(Note: this isn't just an event store constraint - if you were using optimistic concurrency on a shared document store, or a shared relational database, you'd have similar issues.)
So the riddle then becomes: which tradeoffs am I willing to make to get the information I need to the place where I will need it?
One of those tradeoffs turns on the question: are you going to build a fresh copy of the entity from its history every time, or are you going to re-use a cached copy of the entity from the last time you used it?
And if you are going to use a cached copy of the entity, then you need some mechanism for determining whether the copy of the entity you have is consistent with the copy of the history you have.
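For example - the shapes here are assumptions, purely for illustration - a cache entry that remembers which stream version it was built from can be caught up cheaply:

interface Replayable {
  apply(event: object): void;
}

interface CacheEntry<T extends Replayable> {
  entity: T;
  version: number; // the stream version the cached copy was built from
}

// hypothetical store that can read only the events after a given version
interface CatchUpStore {
  readFrom(streamId: string, after: number): Promise<{ events: object[]; version: number }>;
}

async function refresh<T extends Replayable>(
  streamId: string,
  cached: CacheEntry<T>,
  store: CatchUpStore
): Promise<CacheEntry<T>> {
  // replay only the events the cached copy hasn't seen yet
  const { events, version } = await store.readFrom(streamId, cached.version);
  events.forEach((e) => cached.entity.apply(e));
  return { entity: cached.entity, version };
}

Without that version riding along with the cache entry, the only safe option is to throw the copy away and rebuild from scratch.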
Expressing the same idea another way: the notion that our entity in memory is the single "source of truth" is itself a leaky abstraction, and if our goal is a reliable system then we need to address the differences between the abstraction and reality.
The first step of addressing it is to evaluate whether or not it is something that needs to be addressed.
Nobody is giving away prizes to implementations that separate domain and persistence perfectly. Solutions designed with persistence metadata in the model can still deliver value to the customer.
But alternatives include:
allow the metadata in the model, but only as an opaque token. The domain model is independent of any specific implementation of the token, so your dependency graph still points in the right direction (i.e., you can change your persistence without needing to modify the domain code).
keep the metadata "on the envelope": not in the domain object, but in the value that holds the cached copy of the entity (see the sketch after this list).
don't cache; just rebuild the entity from scratch each time, and keep the metadata available locally until you no longer need it.
separate the domain behaviors from your data models; in other words, implement your domain logic as a collection of traits/mixins, and create persistence-specific data models to host them.
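Here is a minimal sketch of the envelope alternative, again with hypothetical names (AccountEvent, Account, VersionedEnvelope, EventStore, AccountRepository are illustrative, not from any particular library). The domain object carries no version at all; the repository is the only code that sees both halves:

type AccountEvent =
  | { type: "Deposited"; amount: number }
  | { type: "Withdrawn"; amount: number };

// The domain object: pure behavior and state, no version anywhere.
class Account {
  private balance = 0;

  apply(event: AccountEvent): void {
    if (event.type === "Deposited") this.balance += event.amount;
    else this.balance -= event.amount;
  }

  withdraw(amount: number): AccountEvent[] {
    if (amount > this.balance) throw new Error("insufficient funds");
    return [{ type: "Withdrawn", amount }];
  }
}

// The envelope carries the persistence metadata alongside the entity.
interface VersionedEnvelope<T> {
  entity: T;
  version: number; // the stream position the entity was built from
}

interface EventStore {
  read(id: string): Promise<{ events: AccountEvent[]; version: number }>;
  append(id: string, events: AccountEvent[], expectedVersion: number): Promise<void>;
}

// The repository is the only place that sees both halves.
class AccountRepository {
  constructor(private store: EventStore) {}

  async load(id: string): Promise<VersionedEnvelope<Account>> {
    const { events, version } = await this.store.read(id);
    const entity = new Account();
    events.forEach((e) => entity.apply(e));
    return { entity, version };
  }

  async save(id: string, envelope: VersionedEnvelope<Account>, newEvents: AccountEvent[]): Promise<void> {
    // compare-and-swap: rejected if the stream moved past envelope.version
    await this.store.append(id, newEvents, envelope.version);
  }
}

The opaque-token variant (the first alternative) looks much the same, except the token lives as an uninterpreted field on the aggregate itself, and only the persistence code knows how to read it.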
None of these are great - you give away some of this to get some of that. So you make the best trade you can and get on with it.