I am working on a Grails 2.5 project and I am facing a problem with GORM/Hibernate. The project is deployed on 2 servers, each getting calls in a round-robin fashion. When we hit one of our APIs, it tries to create a new entity and persist it in the DB.
The problem is that when we call this API, almost every other request fails with an error message like:
com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '493' for key 'PRIMARY'
I tried setting the id value explicitly, but it's not taking that value; it keeps using its own. I also tried mapping the id generator as 'assigned' in the child class (I can't touch the base class, whose generator is set to identity, as there are other child classes).
My question is: how can we purge what Hibernate thinks the next value of the primary key would be? I have tried disabling the second-level cache and the query cache, but that doesn't work. Is there a way to programmatically purge the Hibernate cache, so that it goes to the DB to see what the max id value is? If you are aware of any other way to solve this, please post it too.
EDIT:
Here are the mappings of the base class and child class:
Base class:
abstract class BaseEntity implements Serializable {
    static mapping = {
        tablePerHierarchy false      // avoid creating the base_domain table
        tablePerConcreteClass true
        id generator: 'identity'
        createdDate index: true      // index this column in each subclass table
        inUse index: true            // index this column in each subclass table
    }
}
Child class:
class ActualEntity1 extends BaseEntity {
    static mapping = {
        id generator: 'assigned'
    }
}
These are GORM domain classes; I intentionally removed their other fields.
It is true that using multiple servers in a round-robin fashion could lead to race conditions for the next value of the primary key. When you are using the `identity` generator, Hibernate relies on the underlying database to generate IDs, typically via auto-increment columns. The most straightforward way to handle this in a multi-server environment would be to use a sequence generator: the database will then always generate a unique identifier, ensuring there are no duplicate keys.
That, however, involves modifying the database schema.
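As an illustration only, the child-class mapping for a sequence generator could look like the sketch below. It assumes a database with native sequence support (e.g. PostgreSQL or Oracle, not MySQL), and `actual_entity_seq` is a hypothetical sequence name you would have to create in the schema:

```groovy
class ActualEntity1 extends BaseEntity {
    static mapping = {
        // 'sequence' tells Hibernate to fetch the next id from a DB sequence;
        // both servers then draw from the same server-side counter.
        id generator: 'sequence', params: [sequence: 'actual_entity_seq']
    }
}
```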
You could instead try using UUIDs (Universally Unique Identifiers) as primary keys. UUIDs are generated in such a way that they are virtually guaranteed to be unique.
That will not require introducing a sequence in your DB, but you will need to modify your domain classes (and the id column type) to support UUIDs as primary keys.
It has the advantage of virtually eliminating the possibility of collisions.
But: UUIDs are larger than regular integer IDs and can be harder to manage in some situations.
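A sketch of what the UUID variant could look like in GORM, using Hibernate's `uuid2` generator (assumption: overriding the inherited numeric id with a `String` id is acceptable for this subclass; the id column would become a 36-character string):

```groovy
class ActualEntity1 extends BaseEntity {
    String id   // replaces the inherited Long id

    static mapping = {
        // 'uuid2' generates a standards-compliant UUID in the JVM,
        // so no coordination between the two servers is needed.
        id generator: 'uuid2', type: 'string'
    }
}
```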
Since you have already tried the `assigned` generator in `ActualEntity1`, you can programmatically determine the next ID by querying the database for the maximum current ID and adding one to it. That, however, carries a risk of race conditions: you would have to implement some locking mechanism to ensure the ID is not duplicated.
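A minimal sketch of that "query max id, then assign" approach (race-prone on its own, as noted; field names are placeholders):

```groovy
ActualEntity1.withTransaction {
    // Read the current maximum id from the table.
    Long maxId = ActualEntity1.createCriteria().get {
        projections { max('id') }
    } as Long

    def entity = new ActualEntity1(/* ...your fields... */)
    entity.id = (maxId ?: 0L) + 1   // works because the generator is 'assigned'
    entity.save(flush: true)
}
```

Between the `max('id')` read and the insert, the other server can grab the same value, which is exactly the window the locking mechanism has to close.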
You can also rely on external services:
Build a separate microservice responsible solely for generating unique IDs. Every time a new entity needs to be persisted, this service is called first to retrieve a new ID. The main advantage of this approach is that the ID generation logic is abstracted away and centralized. But this introduces another point of failure and additional latency.
Use a distributed system tool like ZooKeeper or etcd for maintaining a global unique ID generator. These systems can maintain distributed locks, ensuring that you always get unique IDs, even across multiple servers. However, introducing a new system component like this can be complex.
For an illustration, see "Generating Distributed UUID's using Zookeeper" by Phani Kumar Yadavilli, May 2019.
OK. Given the constraints, another approach could be to use a combination of programmatic assignment and synchronization.
That would involve using a dedicated database table as a distributed lock mechanism. It would ensure that only one server can assign IDs at a given time, effectively eliminating the race condition.
You would need to:

1. Create a dedicated table, e.g. DistributedLock, with a single row.
2. Before assigning a new ID, acquire the lock with a SELECT ... FOR UPDATE statement, which tries to lock the row for the duration of the transaction.

That approach does add a bit of overhead for the locking, but it ensures the race condition is handled without changing primary key types or introducing external systems.
A pseudo-code could look like:
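GORM-flavoured pseudo-code, assuming a hypothetical `DistributedLock` domain class whose table holds exactly one row with id 1 (GORM's static `lock(id)` issues a pessimistic `SELECT ... FOR UPDATE`):

```groovy
ActualEntity1.withTransaction {
    // Pessimistic lock on the single row: the other server blocks here
    // until this transaction commits or rolls back.
    DistributedLock.lock(1L)

    // Safe to read the current maximum while holding the lock.
    Long maxId = ActualEntity1.createCriteria().get {
        projections { max('id') }
    } as Long

    def entity = new ActualEntity1(/* ...your fields... */)
    entity.id = (maxId ?: 0L) + 1
    entity.save(flush: true)
    // The row lock is released automatically when the transaction commits.
}
```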
However, if the system will be under very heavy load with frequent entity creation, this might not be the most scalable solution, and you may need to explore other avenues.