How to avoid read failures after 'COPY TO', with the error 'Scanned over 100001 tombstones'?


Scenario: we use 'COPY TO' to export queried data from one Cassandra node (node1), and then use 'COPY FROM' to import that CSV data into another Cassandra node (node2).

After the data migration, we query the data on node2. The query fails and an ERROR appears in system.log.

Our query (run from VS Code) is: select * from pattern.task where session_name='xxx'

We get this query error:

ResponseError: Server failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replicas responded, 1 failed).

The ERROR in system.log is:

ERROR [ReadStage-2] 2024-01-11 18:16:43,502 StorageProxy.java:2025 - Scanned over 100001 tombstones during query 'SELECT * FROM pattern.task WHERE session_name = XXX LIMIT 5000' (last scanned row token was - xxx and partition key was ((xxx), xxx)); query aborted

However, if we run the query in ./bin/cqlsh, it succeeds. The same query also succeeds on node1.

How can I fix this problem? Is it related to tombstones or to the 'COPY FROM' command?

The 'COPY FROM' command finished successfully, so I am not sure whether the problem is caused by that command. And the same query works on node1.


2 Answers

Answer by Madhavan

There are many unknown things in this question.

  1. What is the version of Cassandra (C*) being used here?
  2. What is the cluster topology, i.e. how many DCs and how many nodes in each DC?
  3. What drivers / plugins are you using to connect VSCode with your OSS C* / DSE cluster?
  4. What does the schema of the pattern.task table look like? Run DESCRIBE TABLE pattern.task; via cqlsh to fetch it (see the sketch after this list).
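
For illustration only, here is a purely hypothetical shape for that table; the column names and clustering key (task_id, payload) are placeholders, and the real output of DESCRIBE TABLE may differ. The point is that if session_name is the only partition key, the query in question has to read the whole partition, tombstones included.

    -- Hypothetical sketch of pattern.task; substitute the real schema.
    CREATE TABLE IF NOT EXISTS pattern.task (
        session_name text,          -- partition key (assumed)
        task_id      text,          -- clustering key (assumed)
        payload      text,
        PRIMARY KEY ((session_name), task_id)
    );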

Having said that, it appears to me that this cluster is suffering from multiple problems:

  • First and foremost, the data modeling is not done right, i.e. the table is not modeled for the query in question.
  • It seems that you're updating/deleting a great deal of data in this table, and since you're not querying by the full primary key [partition key(s) + clustering key(s)], the read scans over more than 100K tombstones (see the nodetool check after this list).
  • Using LIMIT 5000 doesn't seem useful here and further aggravates the tombstone read problem.
  • This cluster also appears to be undersized for the load, as we're seeing the following error, which means that none of the replicas were healthy enough to serve the client request:
ResponseError: Server failure during read query at consistency LOCAL_ONE (1 responses were required but only 0 replicas responded, 1 failed).
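
One way to confirm the tombstone buildup on node2 is to check the per-table statistics with nodetool; the table name below comes from the question, and the counters to watch are the tombstones-per-slice figures:

    nodetool tablestats pattern.task
    # In the output, check "Average tombstones per slice (last five minutes)"
    # and "Maximum tombstones per slice (last five minutes)".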

As a first step to band-aid this problem, try using the full primary key in the read query and reduce the limit size to help alleviate the tombstone problem. Next, try running nodetool garbagecollect on the cluster nodes to remove the tombstones; since I do not know the C* version, validate that this command is supported in the version you're running. Finally, perform a thorough end-to-end data modeling exercise to understand the use case and build an efficient table schema that can scale seamlessly.
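
A sketch of the first two steps, reusing the hypothetical task_id clustering column from the schema sketch above (substitute the real clustering key and a real value):

    -- Query by the full primary key with a much smaller limit (names and values are placeholders).
    SELECT * FROM pattern.task
    WHERE session_name = 'xxx' AND task_id = 'some-task-id'
    LIMIT 100;

Then, on each node (nodetool garbagecollect was added in Cassandra 3.10, so check your version first):

    # Single-SSTable compactions that remove deleted data from the table.
    nodetool garbagecollect pattern task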

Answer by stevenlacerda

As @madhavan shared, there are a lot of open questions, but the first thing that came to my mind is that you're importing nulls from your CSV file. When you import nulls, you create tombstones. And it's not just nulls: empty strings, unreadable characters, etc. all turn into null when you import them, and that also creates tombstones.

Check out this link:

cassandra copy makes empty string null on reimport
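
Building on that linked discussion, one workaround sketch (the file path and the '<null>' sentinel are placeholders, and the NULL/HEADER COPY options should be checked against your cqlsh version): use an explicit NULL sentinel on both export and import so that empty strings in the CSV are not reinterpreted as nulls, and therefore tombstones, when they are written on node2.

    -- On node1: export, writing real nulls as an explicit sentinel rather than ''.
    COPY pattern.task TO '/tmp/task.csv' WITH HEADER = TRUE AND NULL = '<null>';

    -- On node2: only fields equal to the sentinel are imported as null;
    -- empty strings stay empty strings and do not create tombstones.
    COPY pattern.task FROM '/tmp/task.csv' WITH HEADER = TRUE AND NULL = '<null>';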