Warning: This is a very general question.
I've walked into an environment where everything is working for the most part but it is held together with scotch tape (don't want to offend the duct tape cult).
Key points
- Backups can't restore to different hardware
- SAN uses Microsoft iSCSI initiator
- Permissions are a non-documented nightmare
- Many single points of failure
- Servers are mostly 4-5+ years old; poor utilization
- 20 servers across 3 offices
- All Windows servers
Should I virtualize first (I already have a basic SAN) or address these issues prior to virtualizing? I think virtualizing will make fixing these issues much easier, but I want to avoid garbage in, garbage out. My biggest concern is that the backups of the Exchange and SQL servers (with middleware) can't be restored to different hardware. I'm planning to go VMware when the time comes.
Your thoughts.... Thanks,
Personally I would look into fixing things up first, then look into improvements in the infrastructure. This way you are not introducing new complexities to the problems.
Let me take a little bit to address the issues you brought up:
"Backups can't restore to different hardware"
This is a MAJOR issue. You should really talk to your backup vendor and figure out why. Is it because they are doing bare-metal backups and the vendor doesn't support bare-metal restores unless the hardware is the same? If so, you should be able to add a data-only backup to the rotation. That way it may be a bit more work to come back up, but you don't lose the important stuff (the data)!
"SAN uses Microsoft iSCSI initiator"
Why do you think this is a problem? There is nothing wrong with the Microsoft iSCSI initiator; in fact, I would be wary of someone who didn't use it on MS platforms. We have hundreds of boxes using the iSCSI initiator to talk to dozens of SANs without issue.
"Permissions are a non-documented nightmare"
This ... sucks. And it happens everywhere. Your best bet is to slowly chip away at documenting these. Search this site; there are a bunch of questions about documenting permissions using scripts. But you don't want to go messing around with things before you know how they are set up right now.
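As a starting point for that documentation, here is a minimal sketch of a read-only snapshot script. It assumes the shares live on Windows servers where the built-in icacls command is available; the function names, the output file name, and the D:\Shares path in the usage note are all made up for illustration. It only reads ACLs and changes nothing.

```python
import csv
import os
import subprocess


def read_acl(path):
    """Return the raw icacls output for one path (read-only, changes nothing)."""
    result = subprocess.run(["icacls", path], capture_output=True, text=True)
    return result.stdout.strip()


def snapshot_permissions(root, out_csv, acl_reader=read_acl):
    """Walk root and record every directory's ACL text in a CSV file.

    acl_reader is swappable so the walk can be tried out on a machine
    without icacls. Returns the number of directories recorded.
    """
    count = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["path", "acl"])
        for dirpath, _dirnames, _filenames in os.walk(root):
            writer.writerow([dirpath, acl_reader(dirpath)])
            count += 1
    return count
```

Something like snapshot_permissions(r"D:\Shares", "permissions-baseline.csv") would give you a dated baseline to diff against later, before anyone touches a single ACL.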
"Many single points of failure"
This is always a tricky one. You need buy-in from the business to get them to spend the money to reduce or eliminate the SPoFs. My best suggestion is to document everything and put together a risk analysis. Then put together a few suggested solutions with approximate costs and present them to the business owners. If they want to reduce or eliminate the SPoFs, you are golden. If not, all you can do is keep documenting them, start documenting the outages they cause, and bring it back up with the owners.
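If it helps to make the risk analysis concrete, one common way is a simple likelihood times impact score per single point of failure, with a rough fix cost next to it. The sketch below is only an illustration: the 1 to 5 scales, the entries, and every number in it are invented, and you would replace them with what you actually find.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int   # 1 (rare) to 5 (expected this year), a judgment call
    impact: int       # 1 (annoyance) to 5 (business stops), a judgment call
    fix_cost: float   # rough estimate, in whatever currency you budget in

    @property
    def score(self):
        return self.likelihood * self.impact


def rank_risks(risks):
    """Highest score first; among equal scores, the cheaper fix comes first."""
    return sorted(risks, key=lambda r: (-r.score, r.fix_cost))


# Illustrative entries only; names and numbers are invented.
risks = [
    Risk("Single SAN controller", likelihood=2, impact=5, fix_cost=8000),
    Risk("One domain controller per office", likelihood=3, impact=4, fix_cost=1500),
    Risk("Single core switch", likelihood=2, impact=4, fix_cost=3000),
]

for r in rank_risks(risks):
    print(f"{r.score:>2}  {r.name}  (approx. cost {r.fix_cost:,.0f})")
```

A ranked table like this is usually easier for business owners to react to than a list of technical worries.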
"Servers are mostly 4-5+ years old; poor utilization"
There is nothing wrong with this as long as they are still under warranty. If the 4-5 year old servers are underutilized, they are good candidates to be virtualized, but you should spend some time doing performance analysis to see where the utilization is (memory, network IO, disk IO, processor, etc.) so you can properly plan your virtualization strategy.
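To give an idea of what that performance analysis could boil down to, here is a rough sketch. It assumes you have already collected percent-busy readings per server and per metric (for example, exported from Performance Monitor); the server names, the sample values, and the 30% threshold are all invented for illustration.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """samples: {server: {metric: [readings]}} -> average and p95 per metric."""
    return {
        server: {
            metric: {"avg": mean(readings), "p95": percentile(readings, 95)}
            for metric, readings in metrics.items()
        }
        for server, metrics in samples.items()
    }


def consolidation_candidates(summary, threshold=30.0):
    """Servers whose 95th percentile stays below threshold on every metric."""
    return sorted(
        server
        for server, metrics in summary.items()
        if all(stats["p95"] < threshold for stats in metrics.values())
    )


# Invented readings, in percent busy.
samples = {
    "FILE01": {"cpu": [5, 8, 12, 6], "memory": [20, 22, 21, 25],
               "disk_io": [10, 15, 9, 11], "network": [3, 4, 2, 5]},
    "SQL01": {"cpu": [40, 85, 60, 70], "memory": [75, 80, 78, 82],
              "disk_io": [55, 90, 65, 70], "network": [20, 25, 30, 22]},
}

print(consolidation_candidates(summarize(samples)))
```

Looking at the 95th percentile rather than the average keeps a server with short, heavy bursts from being mistaken for an idle one.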
"20 servers across 3 offices"
Once again, nothing wrong with this either. You just need to make sure that there are proper remote tools at your disposal: IP KVMs, remote-access power strips, iLO/DRAC cards, etc. In fact, depending on the WAN connection, centralizing could reduce performance and manageability. Once again, take a look at the usage profiles for the servers.
"All Windows servers"
Absolutely nothing wrong with this. Changing things away from Windows just for the sake of getting away from Windows is a bad, bad idea.
So, if I were in your situation, I would sit down and make a list of everything you see as needing to be changed, then order it from most important (e.g. data loss, downtime) to least important (e.g. inconvenience, infrastructure improvements). Then you just work down the list, fixing things one at a time until it's done.
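The list itself can be as simple as a spreadsheet, but here is the same idea as a tiny sketch. The four severity buckets come from the paragraph above; which bucket each issue lands in is my guess from your description, so treat the assignments as placeholders.

```python
from enum import IntEnum


class Severity(IntEnum):
    DATA_LOSS = 1        # fix first
    DOWNTIME = 2
    INCONVENIENCE = 3
    IMPROVEMENT = 4      # fix last


# Bucket assignments are assumptions, not facts about your environment.
issues = [
    ("Permissions are undocumented", Severity.INCONVENIENCE),
    ("Backups can't restore to different hardware", Severity.DATA_LOSS),
    ("Many single points of failure", Severity.DOWNTIME),
    ("Poor server utilization", Severity.IMPROVEMENT),
]


def work_order(items):
    """Most severe first; the sort is stable, so ties keep their listed order."""
    return sorted(items, key=lambda item: item[1])


for name, severity in work_order(issues):
    print(f"{severity.name:<14} {name}")
```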
Virtualization is not a panacea. It may solve some of your problems, but it will introduce new issues along the way. I would think long and hard before jumping into virtualizing things without a solid understanding of how things are now, how virtualizing will change the situation, and what new issues it could introduce.
Well, as with nearly all things, it depends. The one big win you get with virtualizing things in your situation is the ability to snapshot VMs. As you're troubleshooting/patching/fixing/etc, the ability to snap could be a godsend. Conversely, the P2V transition could throw another level of instability/unpredictability into the mix. You can always give P2V a try and if things don't work out well, you haven't really lost anything - you can always go back to the physical host.
I would probably fix things while virtualising: run two infrastructures side by side, the "new" one and the "old" one, and migrate things over one by one.
"If all you have is a hammer, everything looks like a nail."
Reevaluate why you'd want to virtualize the infrastructure. What problems are you encountering that would warrant virtualization? I personally would not virtualize the infrastructure unless you really, really need the hardware that would be freed up for other things. You'd be introducing another issue as well: what if the hardware hosting the hypervisor dies? You might say, "I'll just use HA across two machines with VMs on them!" But what does that buy you over, say, building highly available services installed directly on top of a regular Windows installation?