I am having a hard time figuring this one out. Maybe you can help me.
Problem statement:
Imagine there is a system that records financial transactions of an account (like a wallet service). Transactions are stored in a database and each Transaction denotes an increase or decrease of the balance of a given amount.
On the application code side, when the User wants to purchase, all Transactions for his account are being pulled from the DB and the current balance is calculated. Based on the result, the customer has or has not sufficient funds for the purchase (the balance can never go below zero).
Transactions example:
ID userId amount currency, otherData
Transaction(12345, 54321, 180, USD, ...)
Transaction(12346, 54321, -50, USD, ...)
Transaction(12347, 54321, 20, USD, ...)
Those 3 from above would mean the User has 150 USD on his balance.
Concurrent access:
Now, imagine there are 2 or more instances of such application. Imagine, the User has a balance of 100 USD and bought two items worth of 100 USD at the same time. Request for such a purchase goes to two different instances, which both read all Transactions from DB and reduce them into currentBalance. In both replicas, at the same time balance equals to 100 USD. Both services allow purchase and add new Transaction Transaction(12345, 54321, -100, USD, ...) which decreases the balance by 100.
If there are two, contradictory Transactions inserted into the DB, the balance is incorrect: -100 USD.
Question:
How should I deal with such a situation?
I know that usually optimistic or pessimistic concurrency control is used. So here are my doubts about both:
Optimistic concurrency
It's about keeping the version of the resource and comparing it before the actual update, like a CAS operation. Since Transactions are a form of events - immutable entities - there is no resource which version I could grasp. I do not update anything. I only insert new changes to the balance, which has to be consistent with all other existing Transactions.
Pessimistic concurrency
It's about locking the table/page/row for modification, in case they more often happen in the system. Yeah, ok.. blocking a table/page for each insert is off the table I think (scalability and high load concerns). And locking rows - well, which rows do I lock? Again, I do not modify anything in the DB state.
Open ideas
My feeling is, that this kind of problem has to be solved on the application code level. Some, yet vague ideas that come to my mind now:
- Distributed cache, which holds "lock of given User", so that only one Transaction can be processed at a time (purchase, deposit, withdrawal, refund, anything).
- Each Transaction has having field such as
previousTransactionId- pointer to the last committed Transaction and some kind of unique index on this field (exactly one Transaction can point to exactly one Transaction in the past, first Transaction ever havingnullvalue). This way I'd get constraint violation error trying to insert a duplicate. - Asynchronous processing with queueing system, and having a topic-per-user: exactly one instance processing Transactions for given User one-by-one. Nice try, but unfortunatelly I need to be synchronous with the purchase in order to reply to 3rd party system.
One thing to note is that typically there's a per-entity offset (a monotonically increasing number, e.g.
Account|12345|6789could be the 6789th event for account #12345) associated with each event. Thus, assuming the DB in which you're storing events supports it, you can get optimistic concurrency control by remembering the highest offset seen when reconstructing the state of that entity and conditioning the insertion of events on there not being events for account #12345 with offsets greater than 6789.There are datastores which support the idea of "fencing": only one instance is allowed to publish events to a particular stream, which is another way to optimistic concurrency control.
There are approaches which move pessimistic concurrency control into the application/framework/toolkit code. Akka/Akka.Net (disclaimer: I am employed by Lightbend, which maintains and sells commercial support for one of those two projects) has cluster sharding, which allows multiple instances of an application to coordinate ownership of entities between themselves. For example instance A might have account 12345 and instance B might have account 23456. If instance B receives a request for account 12345, it (massively simplifying) effectively forwards the request to instance A which enforces that only request for account 12345 is being processed at a time. This approach can in some way be thought of as a combination of 1 (of note: this distributed cache is not only providing concurrency control, but actually caching the application state (e.g. the account balance and any other data useful for deciding if a transaction can be accepted) too) and 3 (even though it's presenting a synchronous API to the outside world).
Additionally, it is often possible to design the events such that they form a conflict-free replicated data type (CRDT) which effectively allows forks in the event log as long as there's a guarantee that they can be reconciled. One could squint and perhaps see bank accounts allowing overdrafts (where the reconciliation is allowing a negative balance and charging a substantial fee) as an example of a CRDT.