Question

Jason Axelson

Asked: 2012-08-24 00:26:34 +0800 CST

Dynamically changing one-node Cassandra cluster to two nodes

I have an application that will be dormant most of the time but will need to handle heavy bursts of load a few days out of the month. Since we are deploying on EC2, I would like to keep only one Cassandra server up most of the time, and then on burst days bring one more server up (with more RAM and CPU than the first) to help serve the load. What is the best way to do this? Should I take a different approach?

Some notes about what I plan to do:

Bring the node up and repair it immediately
After the burst time is over decommission the powerful node
Use the always-on server as the seed node

My main question is how to get the nodes to share all the data. I want a replication factor of 2 (so both nodes have all the data), but that won't work while there is only one server. Should I bring up 2 extra servers instead of just one?

2 Answers

brain99 · Answer 1 · 2012-08-27T16:54:15+08:00

It seems that you can quite easily change the replication factor.

This is also mentioned on the Cassandra wiki, where you can find instructions for both increasing and decreasing the replication factor.

This means it should be possible to do this:

change replication factor from 1 to 2
bring up and repair your burst node so that it receives a copy of all data
... do work ...
decommission burst node
change replication factor back from 2 to 1
run cleanup
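
The steps above can be sketched as an ordered command plan. This is only an illustration, not a tested procedure: it assumes the CQL `ALTER KEYSPACE` syntax of later Cassandra versions (older releases changed the replication factor through cassandra-cli instead), a keyspace using SimpleStrategy, and the hypothetical names `burst_plan`, `myks`, and `10.0.0.2`. The function only builds the command strings; it does not run anything.

```python
def burst_plan(keyspace, burst_host):
    """Return the commands for one burst cycle, in order, as (tool, command) pairs.

    Nothing is executed here; the strings are meant to be reviewed and run by hand.
    Assumption: the keyspace uses SimpleStrategy and CQL is available.
    """
    def alter(rf):
        # CQL statement that sets the keyspace replication factor
        return (
            "cqlsh",
            f"ALTER KEYSPACE {keyspace} WITH replication = "
            f"{{'class': 'SimpleStrategy', 'replication_factor': {rf}}};",
        )

    return [
        alter(2),                                                # RF 1 -> 2
        ("shell", f"nodetool -h {burst_host} repair {keyspace}"),  # stream a full copy to the burst node
        # ... do work ...
        ("shell", f"nodetool -h {burst_host} decommission"),     # remove the burst node from the ring
        alter(1),                                                # RF 2 -> 1
        ("shell", f"nodetool cleanup {keyspace}"),               # run on the remaining always-on node
    ]


if __name__ == "__main__":
    for tool, command in burst_plan("myks", "10.0.0.2"):
        print(f"[{tool}] {command}")
```

Keeping the plan as data rather than executing it directly makes it easy to check each step against the cluster state before moving on to the next.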

CraigJPerry · Answer 2 · 2012-09-04T00:53:50+08:00

Changing the replication factor on the fly doesn't work all that well in my experience :-( You can end up with schema disagreements, which are time-consuming to fix, for me at least.

Just thinking out loud but another possible route could be (change timings to suit):

Increase your GC grace period (gc_grace_seconds, which determines how long tombstones live before being purged from disk) to, say, 30 days. Note that in current Cassandra versions this is set per column family rather than in cassandra.yaml.
Spin up a second node every 15 days or so, whether it's needed or not. Ensure its data, commit logs, etc. are preserved between runs. This means you get started quicker when you need to spin up the 2nd node.
Regarding "with more RAM and CPU than the first":

Cassandra effectively divides workload by the amount of the ring each node is responsible for. It might be easier to either make the 2nd node double the capacity of the first, or add 2 nodes of the same size as the first, for easier division of the ring.
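
As a rough illustration of dividing the ring by capacity, the sketch below computes one token per node so that each node's share of the ring is proportional to a weight. It assumes single-token nodes on the RandomPartitioner token space (0 to 2**127) and that a node owns the range ending at its own token; `tokens_for_weights` and `ownership` are hypothetical helper names, not Cassandra tools.

```python
RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token space, one token per node


def tokens_for_weights(weights):
    """Return one token per node so ring ownership is proportional to each weight."""
    total = sum(weights)
    tokens = []
    cumulative = 0
    for weight in weights:
        cumulative += weight
        # A node owns the range that ends at its token, so the token sits
        # at the end of that node's share (wrapping around at RING_SIZE).
        tokens.append((cumulative * RING_SIZE // total) % RING_SIZE)
    return tokens


def ownership(tokens):
    """Return the fraction of the ring each node owns, in the same order as tokens."""
    shares = []
    for i, token in enumerate(tokens):
        # Distance from the previous node's token; index -1 wraps to the last node.
        span = (token - tokens[i - 1]) % RING_SIZE
        # A span of zero only happens for a single node, which owns the whole ring.
        shares.append((span or RING_SIZE) / RING_SIZE)
    return shares


if __name__ == "__main__":
    # Always-on node with weight 1, burst node with double the capacity.
    tokens = tokens_for_weights([1, 2])
    for token, share in zip(tokens, ownership(tokens)):
        print(f"token={token} share={share:.3f}")
```

With weights 1 and 2 the burst node ends up responsible for about two thirds of the ring, which matches the "double the capacity" suggestion; three equal weights give each node one third.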

This will still require manual nodetool intervention when dropping the nodes, though, as hinted handoffs will otherwise fill up disk needlessly on the remaining node.
