Using CassandraTemplate in Spring, I get a batchOps and then insert values in a batch:
CassandraBatchOperations batchOps = cassandraTemplate.batchOps();
batchOps.insert(listOfEntities);
batchOps.execute();
But inserting too many entities in one batch throws an exception because of the BATCH_SIZE_FAIL_THRESHOLD_IN_KB setting.
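For reference, both thresholds live in cassandra.yaml; to the best of my knowledge the defaults are 5 KB (warn) and 50 KB (fail):

# cassandra.yaml (pre-4.1 property names; defaults shown)
batch_size_warn_threshold_in_kb: 5
batch_size_fail_threshold_in_kb: 50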
Using Spring Data Cassandra, how do I determine the size of the inserts before adding them to a batch, so that no single batch exceeds BATCH_SIZE_FAIL_THRESHOLD_IN_KB and the inserts get spread across multiple batches?
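The obvious workaround I can think of is to chunk by statement count rather than by bytes. A sketch of that idea, where chunkSize is just a number I'd have to tune by trial and error against the threshold:

import java.util.List;
import org.springframework.data.cassandra.core.CassandraBatchOperations;
import org.springframework.data.cassandra.core.CassandraTemplate;

// Splits the entity list into fixed-size chunks and runs one batch per chunk.
// chunkSize counts statements, not bytes, so it only approximates the limit.
static <T> void insertInChunks(CassandraTemplate template, List<T> entities, int chunkSize) {
    for (int from = 0; from < entities.size(); from += chunkSize) {
        int to = Math.min(from + chunkSize, entities.size());
        CassandraBatchOperations batchOps = template.batchOps();
        batchOps.insert(entities.subList(from, to));
        batchOps.execute();
    }
}

But a statement count doesn't map cleanly to bytes, which is why I'm asking how to measure the actual size.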
So when a batch statement runs in Cassandra, one node is picked as the coordinator. That node has to hold all of the batched writes in memory, and then route each write to the replica nodes that own the hashed token value of its primary key.
Long story short, I've seen nodes crash because of this. That's why the batch size limit exists, and why it's better with Cassandra not to batch writes, but to send them to the cluster one at a time.
TL;DR:
Do not batch writes in Cassandra; rather execute them one at a time.
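With Spring Data Cassandra that's a one-liner, and the driver will happily keep many individual writes in flight if throughput is a concern. This sketch assumes the same cassandraTemplate and listOfEntities from the question:

// Individual inserts: each write goes straight to the replicas that own it,
// with no oversized batch for a single coordinator to hold in memory.
listOfEntities.forEach(cassandraTemplate::insert);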
Edit 20240311
If I look at the docs for help on how batching works, the key point is how batches confined to a single partition are handled:
So a single-partition batch still uses a coordinator, but the performance hit is smaller, because only a subset of the nodes in the cluster is involved.
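If you do want to keep batching, grouping the writes by partition key at least keeps each batch on a single partition. A sketch, assuming a hypothetical MyEntity with a getPartitionKey() accessor (neither is in your code; substitute your own entity and key):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.springframework.data.cassandra.core.CassandraBatchOperations;

// Group entities by partition key so that each executed batch touches
// exactly one partition (and therefore only that partition's replicas).
Map<String, List<MyEntity>> byPartition = listOfEntities.stream()
        .collect(Collectors.groupingBy(MyEntity::getPartitionKey));

byPartition.values().forEach(group -> {
    CassandraBatchOperations batchOps = cassandraTemplate.batchOps();
    batchOps.insert(group);
    batchOps.execute();
});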
TBH, if I were doing it, I wouldn't bother with batches; I'd do them all one at a time, especially with Spring. Spring makes simple things easier, but with more complex operations we're stuck with Spring's interpretation and implementation of things, which sometimes gets in the way (as you are finding out).