I have a storage unit with two backplanes: one backplane holds 24 disks, the other holds 12. Each backplane is independently connected to the RAID card through its own SFF-8087 port (4 channel / 12 Gbit).
Here is where my question really comes in: can a backplane be overloaded, and how easily? All the disks in the machine are WD RE4 WD1003FBYX (black) drives, which average about 115 MB/sec on writes and 125 MB/sec on reads.
I know things will vary depending on the RAID level or filesystem we put on top of that, but it seems to me that a 24-disk backplane with only one SFF-8087 connector should be able to overload the bus to the point of actually slowing things down.
Based on my math, if I had a RAID 0 across all 24 disks and asked for a large file, I should in theory get 24 * 115 MB/sec, which translates to 22.08 Gbit/sec of total throughput.
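A minimal sketch of that back-of-the-envelope math (the 115 MB/sec figure is just the per-disk write speed quoted above):

    # Theoretical RAID-0 streaming throughput, ignoring RAID/filesystem overhead.
    disks = 24
    per_disk_mb_s = 115                     # WD RE4 average write speed quoted above

    total_mb_s = disks * per_disk_mb_s      # 2760 MB/s
    total_gbit_s = total_mb_s * 8 / 1000    # 22.08 Gbit/s

    print(f"{total_mb_s} MB/s = {total_gbit_s:.2f} Gbit/s")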
Either I'm confused or this backplane is horribly designed -- at least for a performance-based environment.
I'm looking at switching to a model where each drive has its own channel from the backplane (plus new HBAs or RAID card).
EDIT: more details
We have used plain Linux (CentOS), OpenSolaris, software RAID, hardware RAID, ext3/4, and ZFS.
Here are some examples using bonnie++
4 Disk RAID-0, ZFS
WRITE     CPU  RE-WRITE  CPU  READ      CPU  RND-SEEKS
194MB/s   19%  92MB/s    11%  200MB/s   8%   310/sec
194MB/s   19%  93MB/s    11%  201MB/s   8%   312/sec
--------- ---- --------- ---- --------- ---- ---------
389MB/s   19%  186MB/s   11%  402MB/s   8%   311/sec
8 Disk RAID-0, ZFS
WRITE     CPU  RE-WRITE  CPU  READ      CPU  RND-SEEKS
324MB/s   32%  164MB/s   19%  346MB/s   13%  466/sec
324MB/s   32%  164MB/s   19%  348MB/s   14%  465/sec
--------- ---- --------- ---- --------- ---- ---------
648MB/s   32%  328MB/s   19%  694MB/s   13%  465/sec
12 Disk RAID-0, ZFS
WRITE     CPU  RE-WRITE  CPU  READ      CPU  RND-SEEKS
377MB/s   38%  191MB/s   22%  429MB/s   17%  537/sec
376MB/s   38%  191MB/s   22%  427MB/s   17%  546/sec
--------- ---- --------- ---- --------- ---- ---------
753MB/s   38%  382MB/s   22%  857MB/s   17%  541/sec
16 Disk RAID-0, ZFS (now it gets interesting)
WRITE     CPU  RE-WRITE  CPU  READ      CPU  RND-SEEKS
359MB/s   34%  186MB/s   22%  407MB/s   18%  1397/sec
358MB/s   33%  186MB/s   22%  407MB/s   18%  1340/sec
--------- ---- --------- ---- --------- ---- ---------
717MB/s   33%  373MB/s   22%  814MB/s   18%  1368/sec
20 Disk RAID-0, ZFS
WRITE     CPU  RE-WRITE  CPU  READ      CPU  RND-SEEKS
371MB/s   37%  188MB/s   22%  450MB/s   19%  775/sec
370MB/s   37%  188MB/s   22%  447MB/s   19%  797/sec
--------- ---- --------- ---- --------- ---- ---------
741MB/s   37%  376MB/s   22%  898MB/s   19%  786/sec
24 Disk RAID-0, ZFS
WRITE     CPU  RE-WRITE  CPU  READ      CPU  RND-SEEKS
347MB/s   34%  193MB/s   22%  447MB/s   19%  907/sec
347MB/s   34%  192MB/s   23%  446MB/s   19%  933/sec
--------- ---- --------- ---- --------- ---- ---------
694MB/s   34%  386MB/s   22%  894MB/s   19%  920/sec
(anyone starting to see the pattern here?) :-)
28 Disk RAID-0, ZFS
WRITE     CPU  RE-WRITE  CPU  READ      CPU  RND-SEEKS
358MB/s   35%  179MB/s   22%  417MB/s   18%  1105/sec
358MB/s   36%  179MB/s   22%  414MB/s   18%  1147/sec
--------- ---- --------- ---- --------- ---- ---------
717MB/s   35%  359MB/s   22%  832MB/s   18%  1126/sec
32 Disk RAID-0, ZFS
WRITE     CPU  RE-WRITE  CPU  READ      CPU  RND-SEEKS
354MB/s   35%  190MB/s   22%  420MB/s   18%  1519/sec
354MB/s   35%  190MB/s   22%  418MB/s   18%  1572/sec
--------- ---- --------- ---- --------- ---- ---------
708MB/s   35%  380MB/s   22%  838MB/s   18%  1545/sec
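To make the pattern explicit, here is a small sketch that just tabulates the aggregate read totals from the runs above against disk count:

    # Aggregate bonnie++ read throughput (MB/s) vs. number of disks in the RAID-0.
    read_totals = {4: 402, 8: 694, 12: 857, 16: 814, 20: 898, 24: 894, 28: 832, 32: 838}

    for disks, mb_s in read_totals.items():
        print(f"{disks:2d} disks: {mb_s:3d} MB/s total, {mb_s / disks:5.1f} MB/s per disk")
    # Reads stop scaling past ~12 disks and flatten out around 850-900 MB/s,
    # which looks like a shared-link ceiling rather than a per-disk limit.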
More details:
Here is the exact unit:
http://www.supermicro.com/products/chassis/4U/847/SC847E16-R1400U.cfm
Without knowing the exact hardware you're using, the maximum you can get through two SAS SFF-8087 connections is 24 Gbps, or 3 GB/s; but many controller/expander combinations will not actually use all 4 channels in the SFF-8087 correctly, and you end up with roughly the bandwidth of a single link (0.75 GB/s).
Considering your performance numbers, I would venture a guess that the latter is the case.
I was thinking about getting this same unit, but considering the performance you are getting, I'd better think twice.
On the other hand, what RAID controller are you using? I've read elsewhere that those LSI backplanes don't work too well with non-LSI RAID cards.
Regarding the theoretical performance: for the 24-drive backplane, you should have (SAS2) 6 Gbit x 4 = 24 Gbit, which is 1 Gbit per disk. Using the same math, you should get 2 Gbit per disk with the other backplane. Now, 1 Gbit per disk works out to roughly 80 MB/s of usable throughput, and 2 Gbit would be more than enough for the disk itself to become the bottleneck. So:
(80 MB/s * 24) + (125 MB/s * 12) = 3420 MB/s
I know this is only theoretical and nobody would expect these numbers in the real world... but you are getting only around a fifth to a quarter of that. You'd better check this issue with either Supermicro or LSI, because it's very weird.
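As a quick sanity check of that arithmetic, here is a minimal sketch; the 80 MB/s and 125 MB/s per-disk figures are the assumptions from the paragraph above:

    # Per-backplane link assumed above: 4 x 6 Gbit/s SAS2 lanes shared by all disks.
    lanes = 4
    lane_gbit = 6
    link_gbit = lanes * lane_gbit        # 24 Gbit/s per SFF-8087

    per_disk_24bay = link_gbit / 24      # 1 Gbit/s per disk on the 24-bay backplane
    per_disk_12bay = link_gbit / 12      # 2 Gbit/s per disk on the 12-bay backplane

    # Usable per-disk throughput assumed above: ~80 MB/s where the link is the
    # limit (24-bay), the full ~125 MB/s where the disk is the limit (12-bay).
    theoretical_total_mb = 80 * 24 + 125 * 12   # = 3420 MB/s
    print(per_disk_24bay, per_disk_12bay, theoretical_total_mb)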
The average disk speeds you quote are probably for highly sequential operations under the best conditions. If your workload is random, expect much worse (I mean it).
SFF-8087 cables have only 4 channels, and you could essentially overload them with 4 disks reading/writing at maximum speed. The point is, you are usually not doing that all the time, and that's why Supermicro uses the LSI expanders in the E1/E2 versions of their chassis.
If you need maximum performance all the time, you'll need to connect each and every disk to its own SAS port on your controllers. Most controllers have 4, 8, or 16 ports, so you have to do the math and add more controllers to support 36 disks. The Supermicro chassis in the TQ version gives you that kind of direct access to your disks.
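For example, a rough sketch of that port math; the 8-port HBA size is just an assumption for illustration:

    import math

    # Direct-attach (TQ-style chassis): one SAS port per disk, no expander.
    disks = 36                # 24-bay + 12-bay fully populated
    ports_per_hba = 8         # assumed HBA size; 4- or 16-port cards change the count

    hbas_needed = math.ceil(disks / ports_per_hba)
    print(f"{hbas_needed} x {ports_per_hba}-port HBAs for {disks} directly attached disks")
    # -> 5 HBAs, plus enough PCIe slots and lanes to feed them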
Raw IOPS performance varies depending on which device you're looking at. LSI HBAs are usually specced at >250k IOPS, while SATA disks will do 100-120 IOPS, SAS disks a bit more, and average Intel SSDs around 3000 IOPS (highly dependent on your block size).
You have to understand the workload you'll be throwing at this machine, otherwise you'll overprovision the wrong resources. Focusing on raw sequential speed, with each disk having its own channel to the controller, won't help if your workload is highly random and what you actually need is more IOPS from the disks/SSDs.
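To illustrate the point, a toy comparison under assumed numbers (the ~120 IOPS figure is the SATA ballpark mentioned above, and 4 KB is an assumed random I/O size):

    # Rough per-disk throughput: sequential streaming vs. small random I/O.
    seq_mb_s = 115                     # sequential figure quoted for these drives
    random_iops = 120                  # ballpark IOPS for a 7200 rpm SATA disk
    io_size_kb = 4                     # assumed random I/O size

    random_mb_s = random_iops * io_size_kb / 1024   # ~0.47 MB/s
    print(f"sequential: {seq_mb_s} MB/s, random 4K: {random_mb_s:.2f} MB/s")
    # With a random workload the disks, not the backplane link, are the bottleneck,
    # so extra channel bandwidth buys nothing; more spindles or SSDs do.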