DSE Cluster node disk gets filled

I have a 6-node cluster, and each node has 1000 GB of disk. One node's disk unexpectedly filled up to 1000 GB. On analysis, I found that only one keyspace grew, and within it only one table, whose size increased from 200 GB to 800 GB in 24 hours. That suggests someone ran operations against this table only. I want to figure out which operations were performed on this node that led to this growth. Are there any logs I can look at to see what operations were performed?

There are 2 best solutions below

Jim Wartnick On

The way I would approach this is to use "nodetool tablehistograms" to confirm that the table has large partitions. Then I would go to the table directory and run "sstablemetadata" on some of the data files, looking for ones that show large partition sizes.

One trick you could do once you find sstables that have larger partitions is:

sstabledump <sstable> | grep  -n "\"key\" :"

That shows the line number each time the key changes. The larger the gap between two consecutive line numbers, the more rows that partition has.

Here is an example:

sstabledump aa-483-bti-Data.db | grep  -n "\"key\" :"
4:      "key" : [ "PROCESSING" ],
65605:      "key" : [ "PENDING" ],
8552007:      "key" : [ "COMPLETED" ],

As you can see, the gap between PENDING and COMPLETED is much larger than the gap between PROCESSING and PENDING (about 8M lines vs. 65k lines). This tells me that the PROCESSING partition is relatively small compared to PENDING. The only mystery is how large the COMPLETED one is, as there is no "ending" line. To get the total line count, run:

sstabledump aa-483-bti-Data.db | wc -l
16316029

The total line count is about 16.3M, so COMPLETED runs from line 8.5M to line 16.3M, or about 7.8M lines. So the COMPLETED partition is large as well, nearly as large as the PENDING partition (about 8.5M lines).
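If there are many keys, doing that subtraction by hand gets tedious. Here is a minimal Python sketch that does the same arithmetic on saved "grep -n" output. The function name partition_line_spans is my own (hypothetical), and the regex assumes the lines look exactly like the example output above; line counts are only a rough proxy for partition size, not a byte measurement.

```python
import re

# Matches one line of `grep -n` output as shown above, e.g.:
#   65605:      "key" : [ "PENDING" ],
# Assumption: the sstabledump JSON layout matches this sample.
KEY_LINE = re.compile(r'^(\d+):\s*"key"\s*:\s*\[\s*(.*?)\s*\],?\s*$')


def partition_line_spans(grep_lines, total_lines):
    """Return (key, line_span) pairs, largest span first.

    grep_lines  -- lines produced by: sstabledump <sstable> | grep -n "\"key\" :"
    total_lines -- the number reported by: sstabledump <sstable> | wc -l
    """
    starts = []
    for line in grep_lines:
        match = KEY_LINE.match(line.rstrip("\n"))
        if match:
            starts.append((int(match.group(1)), match.group(2)))

    spans = []
    for i, (start, key) in enumerate(starts):
        # A partition ends where the next key starts; the last one
        # runs to the end of the dump.
        end = starts[i + 1][0] if i + 1 < len(starts) else total_lines
        spans.append((key, end - start))
    return sorted(spans, key=lambda pair: pair[1], reverse=True)


# The numbers from the example above.
sample = [
    '4:      "key" : [ "PROCESSING" ],',
    '65605:      "key" : [ "PENDING" ],',
    '8552007:      "key" : [ "COMPLETED" ],',
]
for key, span in partition_line_spans(sample, total_lines=16316029):
    print(f"{span:>10}  {key}")
```

For a real sstable you would save the grep output to a file and pass its lines in instead of the hard-coded sample.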

Looking at sstablemetadata to see if that matches up with the output, I see that it does:

sstablemetadata aa-483-bti-Data.db
Partition Size:
   Size (bytes)         | Count  (%)  Histogram
   943127 (921.0 kB)    |     1 ( 33) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
   129557750 (123.6 MB) |     1 ( 33) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
   155469300 (148.3 MB) |     1 ( 33) OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO

I see two relatively large partitions and one small one. Bingo.

Maybe some of those can help you get to the bottom of your large partition(s).

Aaron On

With DataStax Enterprise, you should be able to turn on the Database Auditing feature. If you configure the logger class CassandraAuditWriter, all activity gets written to the audit_log table in the dse_audit keyspace.

The data is organized by this PRIMARY KEY: ((date, node, day_partition), event_time), and the table has columns such as username, table_name, keyspace_name, operation, and others.

Check out the DataStax docs on that for configuration and query options.

As for (open source) Apache Cassandra, we use Ericsson's Cassandra Audit plugin for this functionality. After adding the project's JAR and making a couple of adjustments to the cassandra.yaml file, you can view the audit.logs for records like:

15:42:41.655 - client:'10.0.110.1'|user:'flynn'|status:'ATTEMPT'|operation:'DELETE FROM ecks.ectbl WHERE partk = ?'
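Once you have audit records like that, you can tally which users ran which kinds of statements against the table that grew. Here is a rough Python sketch that does this. It assumes every record has exactly the shape of the sample line above (a timestamp, " - ", then name:'value' fields separated by "|"); the audit log format can differ in your setup, and the helper names parse_audit_line and count_operations are hypothetical, not part of any plugin.

```python
import re
from collections import Counter

# One name:'value' field, terminated by '|' or end of line.
# Assumption: the record layout matches the sample line above.
FIELD = re.compile(r"(\w+):'(.*?)'(?=\||$)")


def parse_audit_line(line):
    """Split one audit record into a dict, or return None if it does not fit."""
    timestamp, sep, rest = line.strip().partition(" - ")
    if not sep:
        return None
    record = dict(FIELD.findall(rest))
    record["timestamp"] = timestamp
    return record


def count_operations(lines, table):
    """Count (user, statement verb) pairs for operations that mention `table`."""
    counts = Counter()
    for line in lines:
        record = parse_audit_line(line)
        if record is None:
            continue
        operation = record.get("operation", "")
        words = operation.split()
        if not words or table not in operation:
            continue
        counts[(record.get("user"), words[0].upper())] += 1
    return counts


sample_line = (
    "15:42:41.655 - client:'10.0.110.1'|user:'flynn'|status:'ATTEMPT'"
    "|operation:'DELETE FROM ecks.ectbl WHERE partk = ?'"
)
for (user, verb), n in count_operations([sample_line], "ecks.ectbl").most_common():
    print(user, verb, n)
```

A user with a very high INSERT, UPDATE, or BATCH count against the affected table during the 24-hour window would be the first place to look.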