Unable to add nodes to existing Cassandra Cluster


We have a Cassandra cluster of 6 nodes on EC2, and we have to double its capacity to 12 nodes. So, to add 6 more nodes, I followed these steps:

1. Calculated the tokens for 12 nodes and configured the new nodes accordingly (a sketch of the calculation follows this list).

2. Started the new nodes with this configuration so that they would bisect the existing token ranges.

  • At first, all the new nodes showed streaming in progress.
  • In the ring status, all the new nodes were in the "Joining" state.
  • After 12 hours, 2 nodes completed streaming and moved to the normal state.
  • But the remaining 4 nodes, after streaming some amount of data, stopped showing any progress; they look stuck (see the monitoring sketch below).
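
For reference, here is a minimal sketch of the token calculation from step 1, assuming the default RandomPartitioner of 0.8.x (token space 0 to 2**127). With the 6 existing nodes sitting at every second position of the 12-node layout, the new nodes take the in-between tokens and bisect each existing range:

    NUM_NODES = 12

    def initial_tokens(num_nodes):
        # Evenly spaced initial_token values under RandomPartitioner.
        return [(i * 2 ** 127) // num_nodes for i in range(num_nodes)]

    for i, token in enumerate(initial_tokens(NUM_NODES)):
        # Even positions match the tokens of the 6 existing nodes; the
        # odd positions are the initial_token values for the 6 new nodes.
        print("node %2d: initial_token = %d" % (i, token))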

We are running Cassandra 0.8.2, have around 500 GB of data on each existing node, and store the data on EBS volumes.

How can I resolve this issue and get a balanced cluster of 12 nodes?

Can I restart the stuck nodes?

If I clean the data directories of the stuck Cassandra nodes and restart them with a fresh installation, will it cause any data loss?
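
To tell whether the stuck nodes are still streaming, it helps to poll them. Below is a minimal monitoring sketch, assuming nodetool is on the PATH and JMX is reachable on its default port; the host addresses are hypothetical placeholders:

    import subprocess
    import time

    JOINING_NODES = ["10.0.0.7", "10.0.0.8", "10.0.0.9", "10.0.0.10"]

    while True:
        for host in JOINING_NODES:
            # "nodetool -h <host> netstats" lists streams in progress;
            # output that stops changing between polls suggests a stuck
            # stream (also check the Cassandra logs for exceptions).
            out = subprocess.check_output(["nodetool", "-h", host, "netstats"]).decode()
            print("=== %s ===" % host)
            print(out)
        time.sleep(300)  # poll every 5 minutes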

3 Answers

BEST ANSWER

There will not be any data loss if your replication factor is 2 or greater.

Version 0.8.2 of Cassandra has several known issues. Please upgrade to 0.8.8 on all of the original nodes as well as the new nodes that came up successfully, and then start the bootstrap procedure over for the nodes that did not complete (a sketch follows below).

Also, be aware that storing data on EBS volumes is a bad idea:

http://www.mail-archive.com/[email protected]/msg11022.html
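
For the nodes that did not complete, a minimal re-bootstrap sketch, run on the stuck node itself, might look like the following. The data paths and the service command are assumptions about a typical install; adjust them for yours. This is safe only with a replication factor of 2 or greater, because the node's partial data is discarded and streamed again:

    import shutil
    import subprocess

    # Assumed default directory layout; check your cassandra.yaml.
    DATA_DIRS = ["/var/lib/cassandra/data",
                 "/var/lib/cassandra/commitlog",
                 "/var/lib/cassandra/saved_caches"]

    subprocess.check_call(["service", "cassandra", "stop"])   # assumed init script
    for d in DATA_DIRS:
        shutil.rmtree(d, ignore_errors=True)  # discard the partial bootstrap data
    subprocess.check_call(["service", "cassandra", "start"])  # re-joins with the same initial_token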

ANSWER

While this won't answer your question directly, hopefully it points you in the right direction:

There is a fairly active #cassandra IRC channel on freenode.org.

ANSWER

So here is the reason why some of our nodes were stuck:

1) We had upgraded from Cassandra 0.7.2 to Cassandra 0.8.2.

2) We load sstables with the sstableloader utility.

3) But data for some of the column families is inserted directly from a Hadoop job, and the sstables of those column families carry an older format version, because we had not upgraded the Cassandra API used by the Hadoop job.

4) Because of this version mismatch, Cassandra throws a version-mismatch exception and terminates the streaming.

5) So the solution is to run "nodetool scrub <keyspace> <columnfamily>". I used this and my issue was resolved.

So the main point here is: if you have upgraded Cassandra and are then expanding the cluster's capacity, you must run nodetool scrub first (a sketch follows).
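
As a closing illustration, here is a minimal sketch of running scrub across every column family of a keyspace before expanding, assuming nodetool is on the PATH and targets the local node by default; the keyspace and column family names are hypothetical:

    import subprocess

    KEYSPACE = "MyKeyspace"
    COLUMN_FAMILIES = ["Users", "Events"]

    for cf in COLUMN_FAMILIES:
        # "nodetool scrub <keyspace> <cf>" rewrites the sstables in the
        # current on-disk format, clearing the version mismatch that
        # broke the streaming.
        subprocess.check_call(["nodetool", "scrub", KEYSPACE, cf])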