Cassandra OversizedMessageException


I occasionally see the error below, which says that a message is oversized.

The allowed limit of 134217728 bytes works out to 128 MiB, and I can't think of what would produce a message that large.

Will this impact the integrity of the data? And is there something I can do to avoid the error, e.g. resize some parameter in cassandra.yaml?

ERROR [ReadStage-1] 2024-03-29 05:36:26,158 JVMStabilityInspector.java:68 - Exception in thread Thread[ReadStage-1,5,SharedPool]
org.apache.cassandra.net.Message$OversizedMessageException: Message of size 142675369 bytes exceeds allowed maximum of 134217728 bytes
        at org.apache.cassandra.net.OutboundConnection.enqueue(OutboundConnection.java:331)
        at org.apache.cassandra.net.OutboundConnections.enqueue(OutboundConnections.java:92)
        at org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:417)
        at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
        at org.apache.cassandra.net.MessagingService.send(MessagingService.java:406)
        at org.apache.cassandra.net.MessagingService.send(MessagingService.java:376)
        at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
        at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
        at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
        at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
        at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
        at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:124)
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:120)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Unknown Source)

Mário Tavares On

Will this impact the integrity of data?

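No. The exception was thrown in a ReadStage thread. Threads of this type serve local reads, which don't modify the dataset in any way. In the stack trace, the oversized message is the read response being sent from ReadCommandVerbHandler, so the read fails but nothing is written.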

And is there something I can do to avoid the error e.g. resize some param on Cassandra.yaml?

Yes, but I would start by finding and addressing the root cause rather than changing configuration. I can think of two likely scenarios that would trigger this exception:

  1. A client read a large partition (over ~128 MiB) in a single query. To check this, find the maximum uncompressed partition size by running the following:
    1. Cassandra 4.1.X and above:
      nodetool tablestats -s compacted_partition_maximum_bytes -t 1

    2. Previous versions:
      nodetool tablestats | grep "Compacted partition maximum bytes" | awk '{print $5}' | sort -n | tail -1

If you see a partition over 128 MiB, check whether any query reads whole partitions in the corresponding table. If one does, rethink the data model to keep partition size under control. A common solution is to bucket partitions by time or by another field that splits them in a balanced way.
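As a rough sketch of that check, the snippet below compares a partition size in bytes against the 134217728-byte limit from the exception. `check_partition_size` is a hypothetical helper name, not part of Cassandra or nodetool, and the compacted partition size is only a proxy for the size of the read response.

```shell
# Limit taken from the exception message (128 MiB).
LIMIT_BYTES=134217728

# check_partition_size BYTES: hypothetical helper that reports whether
# a size in bytes exceeds the limit, with a whole-MiB approximation.
check_partition_size() {
  bytes=$1
  mib=$((bytes / 1024 / 1024))
  if [ "$bytes" -gt "$LIMIT_BYTES" ]; then
    echo "over limit: ${bytes} bytes (~${mib} MiB)"
  else
    echo "within limit: ${bytes} bytes (~${mib} MiB)"
  fi
}

# Example using the message size from the stack trace in the question.
check_partition_size 142675369
```

On versions before 4.1, the number printed by the second nodetool pipeline above could be passed in as the argument, assuming that pipeline prints a single number.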

  2. A client is issuing a range scan. This covers read queries that span multiple partitions, such as queries that need ALLOW FILTERING and don't filter by partition key, and these are usually very expensive in Cassandra. You can generally catch them in debug.log through the slow query logs. If this is the case, I strongly recommend modeling a table for each of those queries so that all reads are single-partition reads and database performance scales well with the workload.
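To look for such queries, a minimal sketch along these lines can pull slow-query entries out of debug.log. It assumes the log lives at /var/log/cassandra/debug.log and that slow-query summaries contain the phrase "operations were slow". Both are assumptions to adjust for your install and version, and `slow_query_lines` is a hypothetical helper name.

```shell
# slow_query_lines [FILE]: print each slow-query summary line plus the
# five lines that follow it (where the statements are expected to be).
# The default path and the search phrase are both assumptions.
slow_query_lines() {
  log_file=${1:-/var/log/cassandra/debug.log}
  grep -A 5 "operations were slow" "$log_file"
}

# Usage (not run here): slow_query_lines /path/to/debug.log
```

Statements that span many partitions or rely on ALLOW FILTERING are the ones worth checking first.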

Finally, the quick configuration fix (in Cassandra 4.X) is to edit the following parameters in cassandra.yaml and restart nodes to apply changes:
internode_application_send_queue_reserve_endpoint_capacity_in_bytes - defaults to 134217728
internode_application_receive_queue_reserve_endpoint_capacity_in_bytes - defaults to 134217728
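Since both settings take a value in bytes, a small sketch like this can generate the two cassandra.yaml lines from a size in MiB. The 256 MiB figure is only an illustrative choice, not a recommendation, and `print_queue_settings` is a hypothetical helper.

```shell
# print_queue_settings MIB: emit both cassandra.yaml lines with the
# given size converted from MiB to bytes.
print_queue_settings() {
  bytes=$(( $1 * 1024 * 1024 ))
  for direction in send receive; do
    echo "internode_application_${direction}_queue_reserve_endpoint_capacity_in_bytes: ${bytes}"
  done
}

# Illustrative value only: 256 MiB.
print_queue_settings 256
```

Passing 128 reproduces the default of 134217728 listed above, which is a quick way to sanity-check the conversion.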

For more detail, see the official Cassandra documentation on internode messaging.