I have a performance problem with a SQL Server data warehouse solution on a blade connected to an IBM Shark. Disk I/O performance seems to be pathologically slow with excessive page latch wait times. Other people working on SQL Server based MIS projects report similar symptoms but there is no apparent solution. Some tests I have tried are:
Building some or all of the volumes on shelves set up as RAID-10. This did not have any discernible effect.
Disabling the forced write-through behaviour on SQL Server (this may be a placebo - see below).
None of these tests significantly affected speed. The system benchmarks approximately twice as fast on a machine of broadly similar hardware and O/S specification (Windows Server 2003 EE) with a direct-attached SCSI array. On the SAN I have three volumes of 8 disks each (RAID-10 logs, RAID-10 tempdb, RAID-10 data). On the machine with direct-attached disks I had 6 internal and 14 external 10k SCSI disks.
Note that the Shark had 32GB of cache RAM, which should have been enough to fit the working set of the application as tested.
One avenue I investigated was a SQL Server certification program for SAN storage where the SAN honoured I/O with a forced write-through bit set. I could find documentation of EMC, HP and Hitachi SANs that claimed compliance with this, but could find no such literature for the Shark.
Has anyone encountered similar performance issues on a Shark? If so, do you have any insights?
Run through my SAN performance measurement instructions here:
http://sqlserverpedia.com/wiki/SAN_Performance_Tuning_with_SQLIO
Then post the resulting text file somewhere on the web and include a link to it in your question. That should help us figure out what the speed limit is.
Also, how are you connecting between the server and the SAN? How many HBAs are you using, and what kind of multipathing software are you using? If you're only using 2 HBAs or if you're not using multipathing software, you're going to hit a pretty early bandwidth limit. With something like a Shark, you'll want to use at least 4 HBAs and good active/active multipathing software in order to get good throughput.
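To put a rough number on that bandwidth limit, here is a back-of-the-envelope sketch. The figures are assumptions, not measurements from your setup: I'm assuming 2 Gbit/s Fibre Channel links at a nominal 200 MB/s each, and that without active/active multipathing only one path carries traffic at a time. The function name and parameters are made up for illustration, so plug in your own link speed and HBA count.

```python
# Rough ceiling on host-to-SAN throughput.
# ASSUMPTION: 2 Gbit/s Fibre Channel, nominally ~200 MB/s per link.
PER_LINK_MB_PER_SEC = 200


def aggregate_mb_per_sec(hba_count, active_active=True,
                         per_link_mb_per_sec=PER_LINK_MB_PER_SEC):
    """Return an optimistic upper bound on throughput in MB/s.

    ASSUMPTION: with active/active multipathing every HBA carries traffic;
    otherwise only a single path is in use at any one time.
    """
    usable_paths = hba_count if active_active else 1
    return usable_paths * per_link_mb_per_sec


if __name__ == "__main__":
    for hbas, multipath in [(2, True), (4, True), (4, False)]:
        limit = aggregate_mb_per_sec(hbas, active_active=multipath)
        print(f"{hbas} HBAs, active/active={multipath}: ~{limit} MB/s ceiling")
```

If the MB/s figures in your SQLIO results sit near one of these ceilings, the links rather than the disks are the likely bottleneck.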
I've seen this with Sybase and Oracle as well. The Shark hardware just does not perform as well as local disk.
What sort of I/O does the Shark report as actually making it to disk? Does the Shark show any queueing going on? What sort of response time does the SAN show?
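One way to sanity-check the queueing and response-time numbers against each other is Little's Law: average outstanding I/Os = I/Os per second × average response time. The sketch below is only an illustration; the function names and the sample figures are hypothetical, not values from a Shark.

```python
# Little's Law applied to storage: L = lambda * W
#   L      = average number of outstanding (queued + in-flight) I/Os
#   lambda = throughput in I/Os per second
#   W      = average response time in seconds


def outstanding_ios(iops, response_time_ms):
    """Average outstanding I/Os implied by a throughput and a response time."""
    return iops * response_time_ms / 1000.0


def implied_response_time_ms(iops, outstanding):
    """Average response time (ms) implied by a throughput and a queue depth."""
    return outstanding * 1000.0 / iops


if __name__ == "__main__":
    # Hypothetical figures: 5000 IOPS at 4 ms average response time.
    print(outstanding_ios(5000, 4))
    # Hypothetical figures: 5000 IOPS with 20 I/Os outstanding on average.
    print(implied_response_time_ms(5000, 20))
```

If Windows reports a much deeper queue or longer latency than the SAN's own figures imply for the same throughput, the extra time is being spent somewhere between the host and the array.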
How much cache does the SAN have? Are other systems using the SAN as well? How is the cache on the SAN laid out?
If you are only looking at Windows for numbers, you aren't getting the whole story, as the SAN will mask some of these settings.