I have over 200 computers which can provide IPMI services. The servers are manufactured by several different companies (SuperMicro, Dell, etc.), and there are 6-7 BMC models from about 5 different vendors, and each model has its own idiosyncrasies.
So far we have been configuring the BMCs by using a combination of DHCP and manually configuring each BMC. The manual configuration might be done using a bootable CD-ROM, from the BIOS (if supported), from the host operating system with a utility like ipmitool or freeipmi, or remotely using ipmitool if we can determine the network address of the device.
However, this manual configuration is rather tedious. In some cases we want to change a setting globally on all BMCs, which requires that an administrator run a command against dozens of boxes. Since the BMCs are provided by different vendors and each model of BMC might have its own idiosyncrasies, the same command does not always work on all BMCs.
Are there any utilities which allow me to mass configure the BMCs on dozens of boxes? Say that I want to query a parameter on dozens of different BMCs, or change the password, disable HTTP access to the WebUI or disable the infamous cipher zero security hole.
Bonus points for any utility which would allow me to update the BMC firmware, which is necessary to mitigate several security vulnerabilities.
I'd probably use Ansible. It's a very simple configuration management / orchestration engine that's far simpler to get started with than Puppet (Puppet used to be my go-to choice for this, but not always now, having discovered Ansible).
The benefit of Ansible here is that it communicates directly over SSH, so you'd be able to get started using just your existing SSH credentials and workflow.
If you're currently configuring your BMCs with ipmitool, you'd be able to do something like:
Define a Hosts file -- This tells Ansible which hosts are in the bmc group (in this case), and which to run stuff on.
And so on... You can also use hostnames in that file, as long as they're resolvable.
Then create a "playbook", which is the set of commands to run on each host in a host-group. You want to have this kind of top-down directory layout:
A playbook has Roles, which are little sections of configuration that you can break down and reuse.
So I'd create a file called `bmc.yml` (all Ansible configuration is in YAML files).
Then inside `roles/bmcconfig/tasks/main.yml` you can start listing the commands that are to be run on each host to communicate with IPMI.
When you run the playbook with `ansible-playbook -i hosts bmc.yml`, the commands listed in `tasks/main.yml` for each role will be executed in top-down order on each host found in the `bmc` hostgroup in `hosts`.
`group_vars/all` is an interesting file: it allows you to define key-value pairs of variables and values that can be used in your playbooks. So you could define a variable in your `group_vars/all` and, as a result, use it in the playbook.
You can find out way more information about how to use the "modules" - the components of Ansible that allow you to do stuff, how to write your own :D, and so on at the Ansible Documentation Pages.
I have written a small Python tool to run commands on our 1000 machines (and their BMCs, DRACs, iLOs and IMMs).
What I did was write a Python framework called vsc-manage where I can run commands that are sent either to the server or to the BMC, and then configured what type of machine needs what command.
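To give a feel for the "what type of machine needs what command" part, here is a minimal sketch. It is not the actual vsc-manage code: the `Node` type, the class labels (`imm`, `ipmi-lan`, `ipmi-open`), the `admin` user and the exact command lines are assumptions made up for illustration. It only builds the command; it does not run anything.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Node:
    name: str         # host name of the server's operating system
    bmc_address: str  # address of its BMC / IMM / DRAC
    hw_class: str     # hypothetical label describing the management hardware


def imm_power_off(node: Node) -> List[str]:
    # IMM: log in to the management module over ssh and issue "power off".
    # (A real tool may need to drive an interactive session instead.)
    return ["ssh", node.bmc_address, "power off"]


def ipmi_lan_power_off(node: Node) -> List[str]:
    # IPMI over LAN; -E makes ipmitool read the password from IPMI_PASSWORD.
    # The user name "admin" is an assumption.
    return ["ipmitool", "-I", "lanplus", "-H", node.bmc_address,
            "-U", "admin", "-E", "chassis", "power", "off"]


def ipmi_open_power_off(node: Node) -> List[str]:
    # In-band IPMI: run ipmitool on the host OS through the local driver.
    return ["ssh", node.name, "ipmitool", "-I", "open",
            "chassis", "power", "off"]


# One entry per class of hardware; supporting new hardware means adding a row.
POWER_OFF: Dict[str, Callable[[Node], List[str]]] = {
    "imm": imm_power_off,
    "ipmi-lan": ipmi_lan_power_off,
    "ipmi-open": ipmi_open_power_off,
}


def power_off_command(node: Node) -> List[str]:
    """Return the argv that powers off `node`, based on its hardware class."""
    try:
        builder = POWER_OFF[node.hw_class]
    except KeyError:
        raise ValueError(
            f"no power-off command known for class {node.hw_class!r}"
        ) from None
    return builder(node)
```

Keeping the per-class differences in one table means the calling code never has to know which vendor it is talking to.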
I have several classes that combine a mix of these commands.
So for machines with an IMM it will ssh to the IMM and run `power off` (in an expect-script kind of way).
For our IBM blade chassis it will run this on the chassis:
For some Dell DRACs it will run this on the OS (of a master node):
For our newer HP systems that do IPMI (and I see more and more these days) it will run this on the master:
Or, newer Dell systems need `ipmitool -I open`; you might need to play with the protocol a bit.
For settings not included in the IPMI standard I have implemented some things from the DMTF SMASH CLP, e.g. turning the locator LED on:
All of this is wrapped in a command-line tool that can be run from our laptops; it connects to the right master node, runs the right command for the right node, and returns the output, with an additional list of errors if any (based on output on stderr and/or the exit code).
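The "return the output plus a list of errors" behaviour can be sketched roughly like this. Again this is an illustration rather than the real tool: the function name `run_on_nodes` and the error message format are assumptions, and the demo at the bottom uses the Python interpreter as a stand-in for ipmitool or ssh so that it runs anywhere.

```python
import subprocess
import sys
from typing import Dict, List, Tuple


def run_on_nodes(
    commands: Dict[str, List[str]], timeout: float = 60.0
) -> Tuple[Dict[str, str], List[str]]:
    """Run one command per node; return (stdout per node, list of errors).

    A node counts as failed if the command cannot be started, times out,
    exits non-zero, or writes anything to stderr.
    """
    outputs: Dict[str, str] = {}
    errors: List[str] = []
    for node, argv in commands.items():
        try:
            proc = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Binary missing, permission problem, or the BMC never answered.
            errors.append(f"{node}: {exc}")
            continue
        outputs[node] = proc.stdout.strip()
        stderr = proc.stderr.strip()
        if proc.returncode != 0 or stderr:
            errors.append(f"{node}: exit code {proc.returncode}: {stderr}")
    return outputs, errors


if __name__ == "__main__":
    # Stand-in commands; real entries would be ipmitool or ssh invocations.
    demo = {
        "node1": [sys.executable, "-c", "print('Chassis Power is on')"],
        "node2": [sys.executable, "-c", "import sys; sys.exit('no route')"],
    }
    outputs, errors = run_on_nodes(demo)
    print(outputs)
    print(errors)
```

Collecting the errors separately instead of stopping at the first failure matters when you run against hundreds of nodes: you get the results for the healthy ones and a short list of the ones that need a closer look.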
This has proven to be very handy, and adding support for a new class of hardware is relatively easy now (thanks to the fact that most vendors fully support IPMI and DMTF SMASH CLP now).
This is not suited for initial configuration (it needs the BMC to have a unique IP and correct gateway, but this is what our vendors need to supply us with on delivery), but it can do almost anything else (it can also run arbitrary commands on the host operating system, automatically schedule downtime in Icinga/Nagios when you reboot a node, and/or acknowledge 1000 hosts and services in Icinga/Nagios at once).
Updating the BMC firmware and adding support for our switches are outstanding issues that are planned.
UPDATE
Since at least some people seemed interested I have given it a last polish today, and open sourced this at https://github.com/hpcugent/vsc-manage
Whilst this is very much targeted towards our own workflow (Quattor and/or PBS), I hope it can at least be interesting.
I'm surprised nobody mentioned MAAS (http://maas.io/), which does exactly what you are looking for. It can autoconfigure and manage BMCs, and in addition deploy any OS onto the nodes you have enlisted into the system. It has a Web UI and a RESTful API, and is designed to integrate with any automation system.
When a machine PXE-boots for the first time, MAAS uses in-band IPMI to set up credentials automatically for you. From that point onwards, you can easily remotely boot and shut down a machine.
For more details, check the MAAS BMC Power Types documentation that shows how to manually configure a BMC for any node enlisted in MAAS.