With VMFS5 no longer having the 2TB limit for a VMFS volume, I'm considering which scenario would be more beneficial overall:-
- Fewer LUNs of larger size, or
- More LUNs of smaller size.
In my case, I have a new 24-disk storage array with 600GB disks. I'll be using RAID10, so roughly 7.2TB, and trying to decide whether to go with 1 big 7TB datastore, or multiple stores of 1TB each.
What are the pros and cons of each approach?
Update: Of course, I neglected to include hot spares in my calculation, so it'll be just under 7.2TB, but the general idea is the same. :-)
Update 2: There are 60 VMs and 3 hosts. None of our VMs are particularly I/O intensive. Most of them are web/app servers, and also things like monitoring (munin/nagios), 2 Windows DCs with minimal load, and so on. DB servers are rarely virtualised unless they have VERY low I/O requirements. Right now I think the only virtual DB server we have is an MSSQL box and the DB on that box is <1GB.
Update 3: Some more info on the array and FC connectivity. The array is an IBM DS3524, 2 controllers with 2GB cache each. 4x 8Gbit FC ports per controller. Each ESXi host has 2x 4Gbit FC HBAs.
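For reference, the capacity arithmetic behind the figures above can be sketched as follows. This is only a rough sketch: the function name is made up for illustration, the hot spare count of 2 is an assumption (the question does not give a number), and it works in decimal GB while ignoring any array or filesystem overhead.

```python
def raid10_usable_gb(total_disks, disk_gb, hot_spares=0):
    """Usable capacity of a RAID10 set: every data disk is mirrored,
    so only half of the non-spare disks contribute capacity."""
    data_disks = total_disks - hot_spares
    if data_disks < 2 or data_disks % 2:
        raise ValueError("RAID10 needs an even number of data disks (at least 2)")
    return (data_disks // 2) * disk_gb

# 24 x 600GB disks, no hot spares: the "roughly 7.2TB" figure.
print(raid10_usable_gb(24, 600))      # 7200

# Same array with 2 hot spares (assumed count, not stated in the question).
print(raid10_usable_gb(24, 600, 2))   # 6600
```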
You didn't specify how many VMs you have or what they're going to be doing. Even without that information, I'd avoid making one big LUN, for block size/performance, contention and flexibility reasons.
I will assume you are going to virtualize servers, not desktops, all right? Next I'm going to assume that you are going to use several ESX/ESXi servers to access your storage and have them managed by vCenter Server.
When deciding on LUN size and the number of VMFS datastores, you are balancing several factors: performance, configuration flexibility, and resource utilisation, while bound by the supported configuration maximums of your infrastructure.
You could get the best performance with a 1 VM to 1 LUN/VMFS mapping. There is no competition between machines on the same VMFS, no locking contention, each load is separated and all is good. The problem is that you are going to manage an ungodly number of LUNs, may hit supported maximum limits, face headaches with VMFS resizing and migration, have underutilized resources (those few percent of free space on each VMFS add up) and generally create a thing that is not nice to manage.
The other extreme is one big VMFS designated to host everything. You'll get the best resource utilization that way; there will be no problem deciding what to deploy where, and no problem with VMFS X being a hot spot while VMFS Y is idling. The cost will be the aggregate performance. Why? Because of locking. When one ESX host is writing to a given VMFS, the others are locked out for the time it takes to complete the IO and have to retry. This costs performance. Outside playground/test and development environments it is the wrong approach to storage configuration.
The accepted practice is to create datastores large enough to host a number of VMs, and divide the available storage space into appropriately sized chunks. What that number of VMs is depends on the VMs. You may want a single or a couple of critical production databases on a VMFS, but allow three or four dozen test and development machines onto the same datastore. The number of VMs per datastore also depends on your hardware (disk size, rpm, controller cache, etc.) and access patterns (for any given performance level you can host many more web servers on the same VMFS than mail servers).
Smaller datastores also have one more advantage: they physically prevent you from cramming too many virtual machines onto one datastore. No amount of management pressure will fit an extra terabyte of virtual disks onto half a terabyte of storage (at least until they hear about thin provisioning and deduplication).
One more thing: When creating those datastores standardize on a single block size. It simplifies a lot of things later on, when you want to do something across datastores and see ugly "not compatible" errors.
Update: DS3k will have active/passive controllers (i.e. any given LUN can be served by either controller A or B, and accessing the LUN through the non-owning controller incurs a performance penalty), so it will pay off to have an even number of LUNs, evenly distributed between controllers.
I could imagine starting with 15 VMs/LUN with space to grow to 20 or so.
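To make that concrete, here is a small sketch of the sizing arithmetic: take the VM count and a target VMs-per-LUN figure, round the LUN count up to a multiple of the number of controllers, and alternate the preferred owner. The function name and the "A"/"B" controller labels are just placeholders for illustration, not anything the array or vSphere exposes.

```python
import math

def plan_luns(vm_count, vms_per_lun, controllers=2):
    """Return (lun_count, owners): the LUN count is rounded up to a multiple
    of the controller count, and owners alternate A, B, A, B, ..."""
    needed = math.ceil(vm_count / vms_per_lun)
    lun_count = math.ceil(needed / controllers) * controllers
    owners = [chr(ord("A") + i % controllers) for i in range(lun_count)]
    return lun_count, owners

# 60 VMs at 15 VMs per LUN: 4 LUNs, two per controller.
print(plan_luns(60, 15))   # (4, ['A', 'B', 'A', 'B'])

# Letting each LUN grow to 20 VMs would need 3 LUNs, rounded up to 4
# so that both controllers own the same number.
print(plan_luns(60, 20))   # (4, ['A', 'B', 'A', 'B'])
```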
The short answer to your question is: it all depends on what your IO patterns are, and this will be unique to your environment.
I suggest you have a look here http://www.yellow-bricks.com/2011/07/29/vmfs-5-lun-sizing/ as this may help you consider your anticipated IOPS and how many LUNs might be suitable. That said, if you were to err on the side of caution, some people would advise having many LUNs (if my correction to a previous answer is approved, see my comments re LUN IO queues on the array side). I tend to agree, but would go further and then join them together as extents of a single/few VMFS volumes (don't believe the FUD about extents and other VMFS limits: http://virtualgeek.typepad.com/virtual_geek/2009/03/vmfs-best-practices-and-counter-fud.html). This will have the benefit of managing a single/few datastores within vSphere and, since vSphere automatically balances VMs over the available extents starting with the first block of each extent, the performance benefit of spreading your IO over multiple LUNs.
Something else to consider... You say none of the VMs are particularly IO intensive. Given this, you may like to consider a combination of RAID5 and RAID10, to get the best of both worlds (space and speed).
Further, if you have your VMs configured with multiple VMDKs, with the OS and application IO patterns spread across those virtual disks (i.e. OS, web, DB, logs, etc. each on a separate VMDK), you can then locate each VMDK on a different datastore to match the IO abilities of that physical LUN (e.g. OS on RAID5, logs on RAID10). It's all about keeping similar IO patterns together to take advantage of the mechanical behaviour of the underlying disks so that, for example, log writes in one VM don't impact your web read rates in another VM.
FYI... you can successfully virtualise DB servers, you just need to analyse the IO patterns & IOPS rates and target that IO to a suitable LUN, all the while being aware of the IO patterns and IOPS that that LUN is already handling. This is why many admins blame virtualisation for poor DB performance... because they didn't carefully calculate the IO/IOPS that multiple servers would generate when they put them on a shared LUN (i.e. it's the admins' fault, not virtualisation's fault).
Each volume (LUN) has its own queue depth, so to avoid IO contention, a lot of implementations use more, smaller LUNs. That said, you can make a datastore span LUNs quite easily. The disadvantage of larger (and fewer) VMware datastores is, as far as I know, that you can run into some limits on the number of VMs that can be powered on simultaneously.
Another consideration is controller performance. I don't know your SAN specifically, but most if not all SANs require that a LUN is owned by a single controller at a time. You want to have enough LUNs on the system so that you can balance out your workload between controllers.
For example, if you had only one LUN, you'd only be using one active controller at a time. The other would sit idle as it would have nothing to do.
If you had two LUNs, but one was much busier than the other, you'd be using both controllers but not equally. As you add more LUNs the controllers share the workload more evenly.
Specific advice for you:
Your VMs are all relatively the same in terms of performance requirements. I'd create two LUNs, one per controller. Start putting VMs on the first LUN, and measure your I/O latency and queue depth over time as the VMs settle in. Don't use the second LUN yet. Continue filling up LUN 1. You will either reach a point where you start to see performance indicators that the LUN is full, or you will have migrated half of your VMs to that LUN, and still have maintained performance.
If you do see performance issues, I'd remove 30% of the VMs from LUN 1 and move them to LUN 2. Then start filling up LUN 2 in the same manner. Go to LUN 3 if necessary... does that make sense? The idea is to achieve maximum VM density on a given LUN along with a roughly 30% overhead for fudge room.
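If it helps, here is a rough sketch of that fill-and-spill procedure as code. It only models the case where performance degrades. The `is_healthy` callback is hypothetical: it stands in for whatever latency and queue depth measurements you actually take, and the "more than 20 VMs is too many" rule in the example is invented purely so the sketch runs.

```python
def fill_luns(vms, is_healthy, headroom_pct=30):
    """Place VMs on the current LUN one at a time. When the LUN stops looking
    healthy, move headroom_pct percent of its VMs to a new LUN and carry on
    filling that one."""
    luns = [[]]
    for vm in vms:
        current = luns[-1]
        current.append(vm)
        if not is_healthy(current):
            # Number of VMs to move off, rounded up, at least one.
            move = max(1, (len(current) * headroom_pct + 99) // 100)
            moved = current[-move:]
            del current[-move:]
            luns.append(moved)
    return luns

# Toy health check (assumption): a LUN is "unhealthy" above 20 VMs.
vms = ["vm%02d" % i for i in range(60)]
luns = fill_luns(vms, lambda lun: len(lun) <= 20)
print([len(lun) for lun in luns])   # [14, 14, 14, 18]
```

In real life the health check would be your monitoring data over days or weeks, not a function call, but the loop shows the shape of the process: fill, watch, back off by about 30%, move on to the next LUN.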
I would also make a pair of "high performance" LUNs for any heavy hitting VMs. Again, one per controller to share the workload.
Based on your explanations above, you should be fine with 1 datastore. 60 VMs over 3 hosts isn't that bad (20:1). I would, however, recommend you upgrade the HBAs to 8Gb on at least one host if financially possible, provided your fibre switch is at least an 8Gb switch.
That being said, I'd make at least 2 if not 3 datastores on the array. 1 datastore per host, with all servers accessing the others for vMotion. I don't know much about the IBM array; with EMC I would create a single RAID10 group with 3 LUNs for each datastore. With this setup, the host with the 8Gb HBAs would be great for your higher I/O systems.
You can do 1 datastore/server, and there are some instances where I do that myself, but I only do that with SAN replication on special servers. Managing 87 different datastores over 9 servers for vMotion gets confusing when you set it up! The majority of my VMs are in shared datastores with 5-10 servers on them, depending on how much space they need.
One final note. If you have any kind of servers in a failover pair/cluster or load balanced, you're going to want them in different datastores. You don't want the datastore to fail, and then be left with no servers. Admittedly, if you lose the array, you've lost everything, but this is why we back up, right?
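A quick way to check that rule against an inventory is sketched below. The VM names, datastore names and the two dictionaries are made up for illustration; in practice you would build them from whatever export of your VM-to-datastore mapping you have to hand.

```python
def shared_datastore_conflicts(placement, groups):
    """placement maps VM name -> datastore name; groups maps a cluster/pair
    name -> list of member VMs. Returns the groups that have two or more
    members on the same datastore, with the offending VMs per datastore."""
    conflicts = {}
    for group, members in groups.items():
        by_datastore = {}
        for vm in members:
            by_datastore.setdefault(placement[vm], []).append(vm)
        clashes = {ds: vms for ds, vms in by_datastore.items() if len(vms) > 1}
        if clashes:
            conflicts[group] = clashes
    return conflicts

# Hypothetical inventory: both web servers share ds1, the DCs are split.
placement = {"web1": "ds1", "web2": "ds1", "dc1": "ds1", "dc2": "ds2"}
groups = {"web": ["web1", "web2"], "dc": ["dc1", "dc2"]}
print(shared_datastore_conflicts(placement, groups))
# {'web': {'ds1': ['web1', 'web2']}}
```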