I've been running a series of load tests on a dedicated DB SAN from a pre-production cluster (a Dell R710 connecting to a dedicated RAID10 SAN over two gigabit Ethernet connections), and I'm not sure I'm interpreting the data correctly.
For reference, here's the raw data.
Test 1
sqlio v1.5.SG
using system counter for latency timings, 2727587 counts per second
parameter file used: paramD100.txt
file d:\tmp\testfile.dat with 2 threads (0-1) using mask 0x0 (0)
2 threads reading for 120 secs from file d:\tmp\testfile.dat
using 64KB random IOs
enabling multiple I/Os per thread with 2 outstanding
buffering set to use hardware disk cache (but not file cache)
using specified size: 20480 MB for file: d:\tmp\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 372.12
MBs/sec: 23.25
latency metrics:
Min_Latency(ms): 1
Avg_Latency(ms): 10
Max_Latency(ms): 159
Test 2
sqlio v1.5.SG
using system counter for latency timings, 2727587 counts per second
parameter file used: paramD100.txt
file d:\tmp\testfile.dat with 2 threads (0-1) using mask 0x0 (0)
2 threads reading for 120 secs from file d:\tmp\testfile.dat
using 64KB random IOs
enabling multiple I/Os per thread with 2 outstanding
buffering set to use hardware disk cache (but not file cache)
using specified size: 20480 MB for file: d:\tmp\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 358.26
MBs/sec: 22.39
latency metrics:
Min_Latency(ms): 1
Avg_Latency(ms): 10
Max_Latency(ms): 169
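For completeness, the output above corresponds to an sqlio invocation along these lines. This is a reconstruction from the echoed settings rather than the exact command, with the parameter file in the usual <file> <threads> <mask> <size in MB> layout:

sqlio -kR -s120 -frandom -o2 -b64 -LS -BH -FparamD100.txt

where paramD100.txt would contain:

d:\tmp\testfile.dat 2 0x0 20480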
To minimise variation between results, the tests were run at 11:30am on two consecutive days.
Given this load pattern, is MB/s throughput this low to be expected, or am I interpreting this correctly in believing that there's an issue with either the network or the SAN (or the whole lot)?
Thanks.
Update #1
To give specifics, the setup is as follows.
Production DB cluster
Dell R710, with 2 x Broadcom 5709s (iSCSI and TOE offload capable, using Dell's multipathing IO software). And yes, I've seen the 'Broadcom - die mutha' post :S
Switch
2 Juniper EX4200-48T's acting as a single virtual switch
One connection from each Broadcom NIC on each cluster node connects to one switch, and there are 2 gigabit connections from each switch to the SAN.
SAN
Dell EqualLogic PS6000E iSCSI SAN, packed out with 16 x 2TB 7,200rpm drives (14 active + 2 hot spares)
As far as I know, and from how I think this should work, we should theoretically be getting around 200MB/s, which, as you can see, we're not.
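That 200MB/s figure is just the sum of the two gigabit paths (rough numbers, before iSCSI/TCP overhead):

2 links x 1Gbps = 2 x ~125MB/s raw ≈ 250MB/s, call it ~200MB/s usable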
Update #2
To give a bit more context, here's a graph showing the average MB/s for 4 separate runs.
For reference, the Y axis is MB/s, and the X axis is the IO type (random or sequential), pending IOs, and the operation (read vs write).
Images disabled, so here's a link - Graph showing average results for 4 SQLIO runs
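The runs behind the graph were scripted along these lines. This is a sketch rather than the exact script, and the queue-depth values shown are assumptions:

@echo off
rem sweep read/write, random/sequential and pending IO depth with sqlio
rem (queue-depth values here are illustrative, not the exact ones used)
for %%k in (R W) do (
    for %%f in (random sequential) do (
        for %%o in (2 8 16 32) do (
            sqlio -k%%k -s120 -f%%f -o%%o -b64 -LS -BH -FparamD100.txt
        )
    )
)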
There are 2 things concerning me here:
- Firstly, the random read throughput is lower than I'd have expected.
- Secondly, random write IOs plateau at 110MB/s, whereas the array should be capable of more than that.
Is this a roughly expected pattern for this type of setup? And is there anything else that looks out of place or wrong here?
No, you're right, that's not great at all. You don't mention the layout of the SAN, but given it's RAID10, you'd imagine a worst case of a minimum of 4 cheap SATA disks over one of those 1Gbps links (I doubt both will be utilised at the same time due to MAC consistency). Even then I'd expect at least double the random read MB/s you're seeing, and that's worst case. There's something wrong.
Since you don't mention what the SAN is, I'm going to assume it's generic iSCSI.
What kind of disks are in it? What speeds? What cache, if any, is present on the RAID-10?
I agree that 22MB/sec is horribly slow, but if it's just 4 SATA disks in RAID-10, the 350-odd IOPS would be about right.
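The back-of-the-envelope maths behind that: a 7,200rpm SATA disk is good for roughly 80-90 random IOPS, and RAID-10 can service reads from every spindle, so:

4 disks x ~90 IOPS ≈ 360 IOPS
360 IOPS x 64KB ≈ 22.5MB/sec

which lines up almost exactly with the 358-372 IOPS and ~23MB/sec in the test output.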
In your dreams. That would hold when you're not doing random IO; with random IO most of the time is spent seeking from sector to sector, and with a few slow consumer-grade discs that sounds about right. Welcome to a world where you get around 300 IOPS from a pair of discs, while an SSD gives you 60,000. Now maybe you understand why SSDs kill a SAN for random IO, and why the MB/s number that end users love so much has no relevance for database storage backends.
You're also sabotaging yourself:
OK, given that SATA NCQ (Native Command Queuing) intelligently reorders up to 32 pending requests inside the disc, sending only 2 at a time is suboptimal. You get 2 threads x 2 outstanding = 4 IOs in flight, but each disc can handle 32.
That said, in the end you need (a) faster discs or (b) more of them to get higher IOPS. Or a decent second-level cache (Adaptec RAID controllers can use an SSD as a read & write cache).
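As a sketch of what that means for the test itself: bumping sqlio's -o parameter (outstanding IOs per thread) drives the queue deep enough for NCQ to actually do its job. Assuming the same paramD100.txt as in the question:

sqlio -kR -s120 -frandom -o32 -b64 -LS -BH -FparamD100.txt

With 2 threads x 32 outstanding that's 64 IOs in flight across the array, instead of the 4 the original runs allowed.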