At our organization, we're looking at implementing some sort of informal internal policy for server maintenance. What we're looking at doing is completing maintenance on our entire server pool every two months; each month we'll do half of the servers.
What I'm trying to figure out is some way to split the servers into the two groups. Our naming convention leaves much to be desired (though it's getting better), so splitting by name or number doesn't really work.
I can easily take a list of all the servers and split them in two, but with new servers being added constantly and old ones retired, that list would be a headache to maintain. I'd like to look at any given server and know if it should have its maintenance done this month or next.
For example, it would be nice to look at the serial number. If it started with an even number, then it gets maintenance done on even months and vice-versa. This example won't work though as a little over half of the servers are virtual.
Any ideas?
Well, I guess a policy like 'even server serial numbers this month, odd next month' could work fine, but it seems like a lot of work to me to maintain that list. If you could query all your servers for their serial number over the network, you could automate this process, which would make it manageable.
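To make the serial-number rule concrete, here is a rough Python sketch. It assumes you already have each server's serial number as a string (how you collect it is up to you), and the function names are made up for illustration. It uses the first decimal digit found in the serial, which is one reading of "starts with an even number", and it returns None for serials with no digit at all, as may be the case for virtual machines.

```python
from datetime import date

def first_digit(serial):
    """Return the first ASCII digit in the serial as an int, or None if there is none."""
    for ch in serial:
        if ch in "0123456789":
            return int(ch)
    return None

def due_this_month(serial, today=None):
    """True if the serial's parity matches the month's parity, None if undecidable.

    Assumption: even first digit -> even months, odd first digit -> odd months.
    """
    today = today or date.today()
    digit = first_digit(serial)
    if digit is None:
        return None  # no digit to go on; needs some other rule
    return digit % 2 == today.month % 2
```

Calling due_this_month("4XK123") in June would return True under these assumptions, since both 4 and 6 are even.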
I guess I don't really understand this concept of 'monthly maintenance'. What sorts of things do you need to batch up like this and do manually? Are you talking about applying OS patches or firmware updates or what? OS and software patches you should automate as much as possible and probably do on a fairly continuous basis. For firmware or BIOS updates I can see how you would want to group servers, since you have to reboot machines and that sort of thing is a fairly manual process no matter what.
How about this: just name all your machines with a prefix and start at 1. Guesstimate how many servers you might ever have and pad with zeros. Thus your first machine is company-00001, the next is company-00002, etc. You should probably assign functional names for regular use; for example, use extra DNS records to give company-00001 the name web-00001. Then, just look at the last number of the company- name to determine if the machine is odd or even, and group them by month that way.
Our server names all contain a number (e.g. site-service##) and then we group servers by even and odd. Updates and patches are then applied to the even or odd group only. We even use this system to balance power feeds and maintain network gear.
It's a relatively simple system that allows us to easily document changes like "Update XYZ applied to odd web-front ends today."
Of course, this may not be helpful for you if you're starting in an environment where servers are named after Disney characters or something else irrelevant.
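For names that do end in a number, the odd/even grouping could be sketched roughly like this in Python. The function names and the "unnumbered" bucket are my own invention for illustration, and it assumes the number that matters is the trailing run of digits in the hostname.

```python
import re

def maintenance_group(hostname):
    """Return "even" or "odd" from the trailing number in a hostname, or None."""
    match = re.search(r"([0-9]+)$", hostname)
    if match is None:
        return None  # e.g. a server named after a Disney character
    return "even" if int(match.group(1)) % 2 == 0 else "odd"

def split_by_parity(hostnames):
    """Bucket hostnames into even, odd, and unnumbered lists."""
    groups = {"even": [], "odd": [], "unnumbered": []}
    for name in hostnames:
        groups[maintenance_group(name) or "unnumbered"].append(name)
    return groups
```

Anything that lands in the unnumbered list would still need to be assigned to a group by hand or renamed.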
There are an almost infinite number of ways to do this and the suggestions here are going to be subjective. Here are my suggestions:
Split the servers based on function
Split the servers based on "form factor": virtual vs. physical
Split the virtual servers by host