Hi folks,
I work at a company with a handful of Windows servers in production running our software-as-a-service offering. I'm part way through making a bunch of changes, such as creating new databases, moving database files around, setting up logins and enabling/disabling Windows services. I'm not the only person who makes changes, and it's quite possible that if something goes wrong, the person investigating it won't be aware of what changes have been applied recently. For the most part I don't think it's a problem: we're pretty cautious about changing anything, changes usually occur at defined times (when we're upgrading our own software, etc.), and problems are usually fairly easily tracked down.
However, it does occur to me that recording what changes people made, when and why might be useful, if not for tracking down issues, then for rebuilding those machines should we ever need to. How do others handle this?
You need some bureaucracy, maybe something like a CMDB, but it's no silver bullet. The cheapest tool you can start with is MS Word or a wiki.
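If a Word document or wiki page turns out to be too free-form, the same idea can be kept as a small append-only log file. What follows is only a rough sketch, not a recommendation of a particular tool: the function names, the field names (who, what, when, why, server) and the JSON Lines file format are all assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def record_change(log_path, server, who, what, why, when=None):
    """Append one change record (who, what, when, why) to a JSON Lines file.

    All field names here are hypothetical; use whatever your team agrees on.
    """
    entry = {
        "when": (when or datetime.now(timezone.utc)).isoformat(),
        "server": server,
        "who": who,
        "what": what,
        "why": why,
    }
    # Append-only: earlier entries are never rewritten.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def changes_for(log_path, server):
    """Return all recorded changes for one server, in the order they were logged."""
    with open(log_path, encoding="utf-8") as f:
        entries = [json.loads(line) for line in f if line.strip()]
    return [e for e in entries if e["server"] == server]
```

Whoever investigates a problem can then call something like changes_for("changes.jsonl", "db01") to see what was done to that machine and why, and the same list doubles as raw material for a rebuild.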
Servers in production need to be under change control; changes shouldn't be happening willy-nilly.
You have to decide on the right level of bureaucracy for your business.
Why are there multiple people making changes to prod in such a small environment? It may be time to introduce a clear separation of roles and have one production person who has admin access and rolls out all changes.
For rebuilding machines you can do something simple like a 'build guide'. If you create lots of servers that are slightly different, have a generic guide and fill in the blanks for specific servers.
You should also have your disaster recovery plan documented, so that the business knows what to do if a server / data is lost.
Instituting a thorough deployment procedure, including use of a CMDB of some kind (whether the C is for Change or Configuration), is a great first step. Nick definitely covered it well in his answer. Process and procedure help with the legitimate, intentional changes.
I would also recommend looking into a configuration monitoring tool such as Tripwire. These kinds of systems make use of your C(onfiguration)MDB and will alert whenever a device deviates from its recorded configuration. Not only will it help the sysadmins, since it detects the case where somebody unintentionally turns off HA when provisioning a new VM, but it will also make your security folks happy when a rogue Domain Admin adds his buddy to the financials group.
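To make the "alert on deviation" idea concrete, here is a minimal sketch of baseline comparison using file hashes. It is not Tripwire's API or an account of how any particular product works internally; the function names snapshot and drift are hypothetical, and a real tool would also cover services, registry settings, group membership and so on.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each file path to the SHA-256 of its contents (None if the file is missing)."""
    result = {}
    for p in paths:
        path = Path(p)
        if path.is_file():
            result[str(p)] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            result[str(p)] = None
    return result


def drift(baseline, current):
    """Return the sorted paths whose current hash differs from the recorded baseline."""
    keys = set(baseline) | set(current)
    return sorted(k for k in keys if baseline.get(k) != current.get(k))
```

The baseline would be taken right after an approved change and stored alongside the change record; a scheduled job then takes a fresh snapshot and reports anything drift returns, which is either an undocumented change or something worse.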