I'm developing software for the energy consulting business and in monitoring energy use in datacenters, I've noticed that the typical electric load "pattern" of a datacenter is just a flat line, because all the gear runs 24/7. If you compare this to the actual usage pattern (network load, CPU usage etc), which we did, you regularly have long stretches with little usage but the full capacity available.
These patterns are very predictable in many cases and to save energy, it would be great to turn off part of the equipment (servers, switches, storage) regularly or in low-load conditions. However, I can think of several aspects that would have to be looked at, including
- handling peak loads or sudden spikes
- data consistency among nodes
- long startup (and, possibly, synchronization) times compared to average uptime of a node
There's probably more. Is there software that handles such a scenario and what else should be looked out for? Is this a viable suggestion to make?
For my purposes, a cluster doesn't necessarily mean machines clustered at the OS level. Identical hosts that receive requests via a load balancer (i.e. application-level clustering) would also count. I'm not sure how MySQL Cluster or similar products work, but I'd probably count those as well.
I'm looking for advice for any operating system.
See also my post on energy efficiency over at Stack Overflow that brought up this question.
VMware
The latest version of their enterprise product, vSphere 4, can power down hosts that are not needed to meet capacity and wake them up when needed, redistributing the virtual machines in real time. Combine this with the energy savings you get from consolidating your hardware on a virtualized platform and you can get significant power savings.
This was mentioned over on Planet Ubuntu just today. The post can be found here. It talks about the development of a practical solution to power up/down machines on demand in a cloud using PowerNap.
This question has a million answers and most of them won't be right for you.
It's operating system specific, hardware specific and load specific.
If this solution is to be responsive, i.e. quick to reduce power usage and quick to come back, you should look at hardware with ACPI sleep functionality rather than shutting down. As mentioned above, wakeonlan will only work properly when the hardware is in a sleep state.
The second part of this problem is control: when to put a system to sleep and when to wake it back up again. Without knowing how your cluster manages workload, you really won't get an answer to this.
Personally, I run a web farm with a load balancer in front. Traffic is directed to a pair of hosts up to a certain level, and beyond that it is round-robined across the rest. When those other servers show no activity for an hour, or after 18:00, they are put into a sleep state. When the SNMP scripts show that user volume is ramping up on the load balancer, the sleeping hosts are sent a Wake-on-LAN magic packet and the cluster comes back to full strength. It could be more fine-grained, but I can really only experiment in production, so I've made small changes that I'm confident in.
Cheers M.
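For reference, the magic packet mentioned above is simple to build by hand: six 0xFF bytes followed by the target's MAC address repeated 16 times, usually sent as a UDP broadcast. Below is a minimal Python sketch. The function names, the broadcast address and the choice of UDP port 9 are assumptions for illustration, not part of the setup described above.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    # Accept "00:11:22:33:44:55" or "00-11-22-33-44-55".
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a valid MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str,
                      broadcast: str = "255.255.255.255",
                      port: int = 9) -> None:
    """Broadcast the magic packet on the local network (assumed port 9)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_magic_packet("00:11:22:33:44:55")` from a host on the same broadcast domain is enough to wake a sleeping node, provided Wake-on-LAN is enabled in its firmware and NIC settings.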
Power
Use Switched PDUs so that you can turn servers and switches on and off out-of-band. This is OS- and device-independent, which will greatly simplify the configuration and logic that powers things on and off. If your servers all have network-enabled IPMI interfaces, you can use those instead. I would recommend against trying to turn things on and off using higher-level things like wake-on-LAN.
Power up/down Logic
This could take many forms. Some clustering software (such as Moab) has a solution for this built in. Otherwise, you can write some script with the following pseudocode:
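The pseudocode itself is not reproduced in this text. One possible shape for such a script is sketched below in Python. Everything in it is an assumption for illustration: the function names, the 25% headroom, the minimum of two nodes, and the idea of measuring load and per-node capacity in the same unit (for example requests per second).

```python
import math
from typing import List, Tuple


def nodes_needed(current_load: float,
                 capacity_per_node: float,
                 headroom: float = 0.25,
                 min_nodes: int = 2) -> int:
    """How many nodes should be powered on for the current load.

    headroom keeps spare capacity for sudden spikes; min_nodes keeps
    a floor so the cluster never shrinks below a safe size.
    """
    required = math.ceil(current_load * (1 + headroom) / capacity_per_node)
    return max(min_nodes, required)


def plan_power_changes(powered_on: List[str],
                       powered_off: List[str],
                       target: int) -> Tuple[List[str], List[str]]:
    """Return (nodes_to_power_on, nodes_to_power_off) to reach target."""
    surplus = len(powered_on) - target
    if surplus > 0:
        # Too many nodes running: switch off the last ones in the list.
        return [], powered_on[-surplus:]
    if surplus < 0:
        # Too few nodes running: switch on as many as are available.
        deficit = -surplus
        return powered_off[:deficit], []
    return [], []
```

The script would compute the target from whatever load metric you trust and hand the two lists to whatever does the actual switching (PDU or IPMI). Powering off only after the node has been drained from the load balancer is left out of this sketch.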
Put that in cron and have it run every half hour.
Clustering Software Stack
Obviously, you'll need to make sure your clustering software stack can deal with these devices going up and down all the time. Do a lot of testing here, consider obscure timing issues (booting takes time) and any race conditions that will creep up in the power up/down logic you use.
Well, for servers, the SHUTDOWN.EXE command can be used to remotely shut down a Windows box. The same thing could easily be done on Unix with a telnet/ssh script.
The bigger issue would be how to start them back up again. You'd need Wake-on-LAN or something similar for that.
The hard part about doing this is in verifying that the machines you are shutting down aren't actually doing something important. Like that cron job that nobody was really sure where it was supposed to go, so they just put it on one of the clustered web servers. Now you shut that machine down and the job doesn't run anymore like it was supposed to.
If the environment is tightly controlled though and you know exactly what each machine is doing, it would make a lot of sense.
Powering machines on and off remotely really ought not to be a problem today, since practically all server hardware implements IPMI, and getting started with the tools is quite easy.
WoL is good in other use cases, such as when your desktop computer has gone to sleep and you want it to wake up before the backup jobs are run.
There is no standard interface for "sleep-on-LAN". IPMI was designed to solve these sorts of problems, and hence gives you more consistency and better control.
(update: note that you can probably use WoL to wake a machine up if you have used pm-suspend to take a nap instead of shutting down... Could make for an interesting compromise.)
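As a rough illustration of how little is needed to get started, here is a minimal Python sketch that wraps the `ipmitool` command-line client. It assumes `ipmitool` is installed, that the BMC speaks IPMI over LAN (`-I lanplus`), and that the password is supplied through the `IPMI_PASSWORD` environment variable (the `-E` option). The function names, host and user are hypothetical.

```python
import subprocess
from typing import List

# Subset of "chassis power" subcommands used here.
ALLOWED_ACTIONS = {"status", "on", "off", "soft", "cycle"}


def ipmi_power_command(host: str, user: str, action: str) -> List[str]:
    """Build the ipmitool argument list for a chassis power action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "chassis", "power", action]


def ipmi_power(host: str, user: str, action: str) -> str:
    """Run the command and return ipmitool's output (raises on failure)."""
    result = subprocess.run(ipmi_power_command(host, user, action),
                            check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

Here `soft` asks the operating system for a clean shutdown via ACPI, while `off` cuts power immediately, so `soft` is usually the safer choice for scheduled power-downs.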
(Note to search engine: I would have found out about this thread earlier, had it had a title more like "Automated, load adaptive power cycling of cluster nodes")
Sun's SGE (Sun Grid Engine) is a cluster scheduling/batch queueing system which, in its latest release, supports power saving by powering off nodes that are not currently needed according to a given queue/workload specification. Keep in mind that this is an HPC-ish special-purpose system. Powering off certain parts of a datacenter may be one huge dependency problem.