Question

Hanno Fietz

Asked: 2009-07-14 05:28:08 +0800 CST

How can I shut down (power off) cluster nodes during low load?

772

I'm developing software for the energy consulting business. While monitoring energy use in datacenters, I've noticed that the typical electric load "pattern" of a datacenter is just a flat line, because all the gear runs 24/7. If you compare this to the actual usage pattern (network load, CPU usage, etc.), which we did, you regularly see long stretches with little usage while the full capacity remains available.

These patterns are very predictable in many cases, and to save energy it would be great to turn off part of the equipment (servers, switches, storage) on a regular schedule or in low-load conditions. However, I can think of several aspects that would have to be looked at, including:

handling peak loads or sudden spikes
data consistency among nodes
long startup (and, possibly, synchronization) times compared to average uptime of a node

There's probably more. Is there software that handles such a scenario and what else should be looked out for? Is this a viable suggestion to make?

For my purposes, a cluster doesn't have to mean machines clustered at the OS level; identical hosts that receive requests via a load balancer (i.e. application-level clustering) would also count. I'm not sure how MySQL Cluster or similar systems work, but I'd probably count those as well.

I'm looking for advice for any operating system.

See also my post on energy efficiency over at Stack Overflow that brought up this question.

7 Answers

Voted

Kevin Kuphal · Answer 1 · 2009-07-14T08:38:34+08:00

VMware

The latest version of their enterprise product, vSphere 4, can power down hosts that are not needed to meet capacity and wake them up when needed, redistributing the virtual machines in real time. Combine this with the energy savings you get from consolidating your hardware on a virtualized platform and you can achieve significant power savings.

4

Coops · Answer 2 · 2009-07-14T08:44:56+08:00

This was mentioned over on Planet Ubuntu just today. The post can be found here. It talks about the development of a practical solution to power up/down machines on demand in a cloud using PowerNap.

3

Michael Henry · Answer 3 · 2009-07-14T06:56:25+08:00

This question has a million answers and most of them won't be right for you.

It's operating system specific, hardware specific and load specific.

If this solution is to be responsive, i.e. quick to reduce power usage and quick to come back, you should look at hardware with ACPI sleep functionality rather than shutting down. As mentioned above, wakeonlan will only work properly when the hardware is in a sleep state.

The second part of this problem is control: when to put a system to sleep and when to wake it back up again. Without knowing how your cluster manages workload, you really won't get an answer to this.

Personally, I run a web farm that has a load balancer in front. Traffic is directed to a pair of hosts up to a certain level, and beyond that it is distributed round-robin across the rest. When those other servers don't show any activity for an hour, or after 18:00, they are put into a sleep state. When the SNMP scripts show that user volume is ramping up on the load balancer, the sleeping hosts are sent a wakeonlan magic packet and the cluster comes back to full strength. It could be more fine-grained, but I can really only experiment in live, so it's been small moves that I'm confident in.
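For anyone who hasn't seen one, a magic packet is just six 0xFF bytes followed by the target's MAC address repeated 16 times, usually sent as a UDP broadcast. Here is a minimal Python sketch of that; the function names, the broadcast address and the choice of port 9 are my own assumptions for illustration, not part of the setup described above.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload for a MAC like 'aa:bb:cc:dd:ee:ff'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits in MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumed defaults)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

A monitoring script would call something like `send_magic_packet("00:11:22:33:44:55")` for each sleeping host once load starts ramping up. The NIC and BIOS have to have Wake-on-LAN enabled for this to do anything.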

Cheers M.

lukecyca · Answer 4 · 2009-07-14T08:36:07+08:00

Power

Use Switched PDUs so that you can turn servers and switches on and off out-of-band. This is OS- and device-independent, which will greatly simplify the configuration and logic that powers things on and off. If your servers all have network-enabled IPMI interfaces, you can use those instead. I would recommend against trying to turn things on and off using higher-level things like wake-on-LAN.

Power up/down Logic

This could take many forms. Some clustering software (such as Moab) has a solution for this built in. Otherwise, you can write a script along the lines of the following pseudocode:

Check overall cluster load
If cluster load > threshold1, turn on some nodes
If cluster load < threshold2, turn off some nodes

Put that in cron and have it run every half hour.
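The pseudocode above could be fleshed out roughly like this in Python. This is only a sketch of the decision step: the function name, the default thresholds, the `min_active` floor and the `step` size are all hypothetical values you would tune, and measuring the load and actually switching nodes (PDU, IPMI, etc.) is left out. Here `high` plays the role of threshold1 and `low` the role of threshold2; keeping a gap between them helps avoid nodes flapping on and off.

```python
def plan_power_action(load, active_nodes, total_nodes,
                      high=0.75, low=0.30, min_active=2, step=1):
    """Return the change in active node count.

    Positive means "power on this many nodes", negative means "power off
    this many nodes", zero means "leave the cluster alone".
    `load` is the overall cluster load as a fraction between 0 and 1.
    """
    if not 0 < low < high:
        raise ValueError("need 0 < low < high")
    if load > high:
        # Bring up more nodes, but never more than are physically available.
        return min(step, total_nodes - active_nodes)
    if load < low:
        # Shed nodes, but never drop below the minimum we always keep running.
        return -min(step, max(active_nodes - min_active, 0))
    return 0
```

For example, `plan_power_action(0.9, 4, 10)` asks for one more node, while `plan_power_action(0.1, 2, 10)` does nothing because the cluster is already at its minimum size. A crontab schedule of `*/30 * * * *` would run the surrounding script every half hour.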

Clustering Software Stack

Obviously, you'll need to make sure your clustering software stack can deal with these devices going up and down all the time. Do a lot of testing here, and consider obscure timing issues (booting takes time) and any race conditions that may crop up in the power up/down logic you use.

Eric Petroelje · Answer 5 · 2009-07-14T06:31:50+08:00

Well, for servers, the SHUTDOWN.EXE command can be used to remotely shut down a Windows box. The same thing could easily be done on Unix with a telnet/ssh script.
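As a rough illustration, here is a Python sketch that builds the two command lines. The helper name and the `os_family` labels are made up for this example, and the Unix variant assumes key-based SSH login and passwordless sudo on the target, which may not match your environment.

```python
def remote_shutdown_command(host, os_family):
    """Build (but do not run) a command line that shuts down a remote host."""
    if os_family == "windows":
        # shutdown /s = shut down, /m \\host = target machine, /t 0 = no delay
        return ["shutdown", "/s", "/m", "\\\\" + host, "/t", "0"]
    if os_family == "unix":
        # Assumes SSH key authentication and passwordless sudo on the target.
        return ["ssh", host, "sudo", "shutdown", "-h", "now"]
    raise ValueError(f"unknown os_family: {os_family!r}")
```

The resulting list can be handed to `subprocess.run(...)` from whatever script decides that a node is idle.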

The bigger issue would be how to start them back up again. You'd need Wake-on-LAN or something similar for that.

The hard part about doing this is in verifying that the machines you are shutting down aren't actually doing something important. Like that cron job that nobody was really sure where it was supposed to go, so they just put it on one of the clustered web servers. Now you shut that machine down and the job doesn't run anymore like it was supposed to.

If the environment is tightly controlled though and you know exactly what each machine is doing, it would make a lot of sense.

0

conny · Answer 6 · 2010-01-12T08:26:44+08:00

Powering machines on and off remotely really ought not to be a problem today, since practically all server hardware implements IPMI, and getting started with the tools is quite easy.
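To give an idea of what "getting started" looks like, here is a small Python sketch that wraps the `ipmitool` command-line client. The wrapper functions are hypothetical; the sketch assumes `ipmitool` is installed, that the BMC speaks IPMI over LAN (`-I lanplus`), and that the password is supplied through the `IPMI_PASSWORD` environment variable (which is what `-E` reads).

```python
import subprocess

# Subset of "chassis power" actions this sketch allows.
ALLOWED_ACTIONS = {"status", "on", "off", "soft", "cycle"}


def ipmi_power_command(host, user, action):
    """Build the ipmitool command line for a chassis power action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "chassis", "power", action]


def ipmi_power(host, user, action):
    """Run the action against the node's BMC and return ipmitool's output."""
    result = subprocess.run(ipmi_power_command(host, user, action),
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Here `soft` asks the OS for a clean ACPI shutdown, while `off` cuts power immediately, so a load-adaptive script would normally prefer `soft` for powering down and `on` for bringing nodes back.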

WoL is good in other use cases, such as when your desktop computer has gone to sleep and you want it to wake up before the backup jobs are run.

There is no standard interface for "sleep-on-LAN". IPMI was designed to solve these sorts of problems, and hence gives you more consistency and better control.

(Update: note that you can probably use WoL to wake a machine up if you have used pm-suspend to take a nap instead of shutting down. That could make for an interesting compromise.)

(Note to search engine: I would have found out about this thread earlier, had it had a title more like "Automated, load adaptive power cycling of cluster nodes")

0

pfo · Answer 7 · 2010-01-23T09:03:11+08:00

Sun's SGE (Sun Grid Engine) is a cluster scheduling/batch queueing system which, in its latest release, supports power saving by powering off nodes that are not currently needed according to a given queue/workload specification. Keep in mind that this is an HPC-ish special-purpose system. Powering off certain parts of a datacenter may be one huge dependency problem.

0
