Estimation of data volume

321 Views Asked by Dieudonné Madishon NGAYA At 18 May 2025 at 11:58

I have a 3-node Cassandra cluster that holds data from 3 applications. We are now planning to add 3 new applications, which will increase the workload on the cluster. What steps should I follow to project future capacity, for example to decide whether we need to add another node? Is it possible to use cassandra-stress for that? If yes, which metrics should I look at?

Thank you for your advice.


There are 2 best solutions below

rsangar1 On 13 February 2016 at 20:35

For a 3-node cluster, if you are adding 3 more applications on top of the current 3, make sure the cluster will be able to take the load. You should know the peak read and write volume of each application. Based on those read and write figures, benchmark the cluster with the cassandra-stress tool. I would recommend using a different cluster for the new applications.
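As a rough illustration of the comparison described above, here is a minimal Python sketch that sums the peak reads and writes of every application and compares them with the throughput a benchmark reached. The names (`AppLoad`, `total_peak`, `headroom`) and all figures are hypothetical, and the sketch assumes the worst case where every application peaks at the same time.

```python
from dataclasses import dataclass


@dataclass
class AppLoad:
    """Peak traffic of one application (hypothetical figures)."""
    name: str
    peak_reads_per_s: int
    peak_writes_per_s: int


def total_peak(apps):
    """Sum peak reads and writes, assuming all peaks coincide (worst case)."""
    reads = sum(a.peak_reads_per_s for a in apps)
    writes = sum(a.peak_writes_per_s for a in apps)
    return reads, writes


def headroom(benchmarked_ops_per_s, required_ops_per_s):
    """Ratio of benchmarked throughput to required throughput.

    A value below 1.0 means the benchmarked cluster could not sustain the peak.
    """
    if required_ops_per_s <= 0:
        raise ValueError("required_ops_per_s must be positive")
    return benchmarked_ops_per_s / required_ops_per_s


apps = [
    AppLoad("existing-app", 2000, 1000),
    AppLoad("new-app", 3000, 4000),
]
reads, writes = total_peak(apps)

# 8000 writes/s is a made-up benchmark result, not a measured number.
write_headroom = headroom(8000, writes)
print(reads, writes, write_headroom)
```

A headroom close to 1.0 leaves no margin for compaction, repairs, or node failures, so a comfortable margin above the expected peak is usually what you want to see.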

Jeff Jirsa On 14 February 2016 at 06:05

The cassandra-stress tool can, indeed, be used to model your expected applications, so that you can write data and see how your cluster scales. You should (for what should be obvious reasons) run it against a cluster of similar size on hardware similar to yours, but not against your live production cluster: cassandra-stress WILL increase throughput until the cluster fails, that's the point of the stress utility. You could also write a test that slowly inserts data matching your applications into the database, run nodetool flush to force that data into SSTables, and then measure the change in load to determine how many bytes per application you should expect. You can then use that figure in traditional capacity estimation calculations.
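The flush-and-measure approach above can be turned into a small calculation. The Python sketch below is only an illustration under stated assumptions: the load figures are the sum of the per-node load across the test cluster (so replication is already included and the test cluster uses the same replication factor as production), and `max_disk_utilization` is a hypothetical safety margin that leaves free disk space for compaction. Function names and numbers are made up for the example.

```python
import math


def replicated_bytes_per_row(load_before_bytes, load_after_bytes, rows_inserted):
    """On-disk bytes per inserted row, replicas included.

    Both load values are assumed to be summed across all nodes and read
    after `nodetool flush`, so the delta reflects data written to SSTables.
    """
    if rows_inserted <= 0:
        raise ValueError("rows_inserted must be positive")
    return (load_after_bytes - load_before_bytes) / rows_inserted


def nodes_needed(rows_per_day, retention_days, row_bytes,
                 existing_bytes, disk_bytes_per_node,
                 max_disk_utilization=0.5):
    """Estimate node count from projected data volume.

    max_disk_utilization is an assumed margin, not a Cassandra setting.
    """
    projected = rows_per_day * retention_days * row_bytes + existing_bytes
    usable_per_node = disk_bytes_per_node * max_disk_utilization
    return math.ceil(projected / usable_per_node)


# Hypothetical test run: 1,000,000 rows grew the cluster-wide load by 250 MB.
row_bytes = replicated_bytes_per_row(1_000_000_000, 1_250_000_000, 1_000_000)

# Hypothetical projection: 5M rows/day kept for a year, 600 GB already stored,
# 1 TB of disk per node.
nodes = nodes_needed(
    rows_per_day=5_000_000,
    retention_days=365,
    row_bytes=row_bytes,
    existing_bytes=600_000_000_000,
    disk_bytes_per_node=1_000_000_000_000,
)
print(row_bytes, nodes)
```

The estimate covers disk only. Compression, compaction, and TTLs change the on-disk size over time, so it is worth repeating the measurement with a realistic data mix and checking throughput separately with the stress test.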
