I bring up, yet again, the ever-present question of how to best optimize disk structures. In my organization, we have a 14TB Linux software RAID array dedicated to storing backups made using Symantec Backup Exec. These are large files, 10GB - 100GB each, with some supporting metadata files a couple KB in size. Long story short, we have to recreate the array, and I would like to know the optimal array chunk size for this use case.
Details of our setup:
A Netgear ReadyNAS Pro, running a clean & updated install of CentOS 6.4.
6 x 3TB consumer (SATA II, 7200 RPM) hard drives from assorted vendors (identical in size).
Each drive has 3 identical partitions which form 3 software RAID devices:
- /dev/md0: 6 x 32GB for / in a RAID6
- /dev/md1: 6 x 4GB swap in a RAID10
- /dev/md2: 6 x 2.7TB storage in a RAID5 for ~14TB total useful storage
Additionally, there is an integrated 128MB flash device set up as /boot
/dev/md2 is the array I'm focused on. It is made available as drive "R:" to a Windows Server 2008 R2 box running Symantec Backup Exec via multipath iSCSI over dual gigabit NICs on both machines (also running 9k jumbo frames).
On the Server 2008 box, R: is formatted as NTFS with a 64k cluster size, and is dedicated to storing backup files. The average file is generally between 40MB and 5GB, depending on the current proportion of full vs incrementals/differentials present. Disk usage is about a 50/50 split between read and write, as we mirror backups from this drive to tape as well.
Overall, given the hardware, I think I've optimized this setup fairly well. However, I'm not a storage expert, and the implications of the RAID chunk size are slightly beyond me. I know the default mdadm chunk size is 512KB. Is this optimal for my scenario? Should I adjust it to match the NTFS cluster size? Or is there some magic formula I've missed?
Thanks for any help you can provide.
Edit: Benchmark results below. Not all combinations were tested.
########## 4K Chunk##########
-----------------------------------------------------------------------
CrystalDiskMark 3.0.2 x64 (C) 2007-2013 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]
Sequential Read : 111.551 MB/s
Sequential Write : 96.759 MB/s
Random Read 512KB : 107.033 MB/s
Random Write 512KB : 56.770 MB/s
Random Read 4KB (QD=1) : 9.500 MB/s [ 2319.2 IOPS]
Random Write 4KB (QD=1) : 5.042 MB/s [ 1231.0 IOPS]
Random Read 4KB (QD=32) : 101.717 MB/s [ 24833.3 IOPS]
Random Write 4KB (QD=32) : 8.237 MB/s [ 2010.9 IOPS]
Test : 1000 MB [R: 0.0% (0.1/13791.8 GB)] (x5)
Date : 2013/07/12 13:10:31
OS : Windows Server 2008 R2 Enterprise Edition (Full installation) SP1
[6.1 Build 7601] (x64)
########## 32K Chunk##########
-----------------------------------------------------------------------
CrystalDiskMark 3.0.2 x64 (C) 2007-2013 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]
Sequential Read : 91.276 MB/s
Sequential Write : 11.119 MB/s
Random Read 512KB : 0.000 MB/s
Random Write 512KB : 0.000 MB/s
Random Read 4KB (QD=1) : 0.000 MB/s [ 0.0 IOPS]
Random Write 4KB (QD=1) : 0.000 MB/s [ 0.0 IOPS]
Random Read 4KB (QD=32) : 0.000 MB/s [ 0.0 IOPS]
Random Write 4KB (QD=32) : 0.000 MB/s [ 0.0 IOPS]
Test : 1000 MB [R: 0.0% (0.1/13791.8 GB)] (x5)
Date : 2013/07/12 14:37:05
OS : Windows Server 2008 R2 Enterprise Edition (Full installation) SP1
[6.1 Build 7601] (x64)
########## 64K Chunk##########
-----------------------------------------------------------------------
CrystalDiskMark 3.0.2 x64 (C) 2007-2013 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]
Sequential Read : 111.968 MB/s
Sequential Write : 103.318 MB/s
Random Read 512KB : 105.047 MB/s
Random Write 512KB : 48.321 MB/s
Random Read 4KB (QD=1) : 10.373 MB/s [ 2532.5 IOPS]
Random Write 4KB (QD=1) : 5.180 MB/s [ 1264.5 IOPS]
Random Read 4KB (QD=32) : 95.106 MB/s [ 23219.3 IOPS]
Random Write 4KB (QD=32) : 9.108 MB/s [ 2223.6 IOPS]
Test : 1000 MB [R: 0.0% (0.1/13791.8 GB)] (x5)
Date : 2013/07/12 12:47:37
OS : Windows Server 2008 R2 Enterprise Edition (Full installation) SP1
[6.1 Build 7601] (x64)
########## 128K Chunk##########
-----------------------------------------------------------------------
CrystalDiskMark 3.0.2 x64 (C) 2007-2013 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]
Sequential Read : 111.908 MB/s
Sequential Write : 94.305 MB/s
Random Read 512KB : 104.772 MB/s
Random Write 512KB : 43.821 MB/s
Random Read 4KB (QD=1) : 9.247 MB/s [ 2257.6 IOPS]
Random Write 4KB (QD=1) : 4.929 MB/s [ 1203.3 IOPS]
Random Read 4KB (QD=32) : 101.764 MB/s [ 24844.8 IOPS]
Random Write 4KB (QD=32) : 7.949 MB/s [ 1940.6 IOPS]
Test : 1000 MB [R: 0.0% (0.1/13791.8 GB)] (x5)
Date : 2013/07/12 13:52:01
OS : Windows Server 2008 R2 Enterprise Edition (Full installation) SP1
[6.1 Build 7601] (x64)
########## 512K Chunk##########
-----------------------------------------------------------------------
CrystalDiskMark 3.0.2 x64 (C) 2007-2013 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]
Sequential Read : 110.237 MB/s
Sequential Write : 93.149 MB/s
Random Read 512KB : 104.892 MB/s
Random Write 512KB : 41.407 MB/s
Random Read 4KB (QD=1) : 6.760 MB/s [ 1650.3 IOPS]
Random Write 4KB (QD=1) : 3.539 MB/s [ 864.0 IOPS]
Random Read 4KB (QD=32) : 101.139 MB/s [ 24692.3 IOPS]
Random Write 4KB (QD=32) : 7.166 MB/s [ 1749.6 IOPS]
Test : 1000 MB [R: 0.0% (0.1/13791.8 GB)] (x5)
Date : 2013/07/12 12:22:58
OS : Windows Server 2008 R2 Enterprise Edition (Full installation) SP1
[6.1 Build 7601] (x64)
##########1024K Chunk##########
-----------------------------------------------------------------------
CrystalDiskMark 3.0.2 x64 (C) 2007-2013 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]
Sequential Read : 112.327 MB/s
Sequential Write : 92.353 MB/s
Random Read 512KB : 107.015 MB/s
Random Write 512KB : 39.793 MB/s
Random Read 4KB (QD=1) : 9.536 MB/s [ 2328.0 IOPS]
Random Write 4KB (QD=1) : 3.671 MB/s [ 896.3 IOPS]
Random Read 4KB (QD=32) : 101.990 MB/s [ 24900.0 IOPS]
Random Write 4KB (QD=32) : 0.000 MB/s [ 0.0 IOPS]
Test : 1000 MB [R: 0.0% (0.1/13791.8 GB)] (x5)
Date : 2013/07/12 14:17:08
OS : Windows Server 2008 R2 Enterprise Edition (Full installation) SP1
[6.1 Build 7601] (x64)
At a minimum, you want the chunk size to be a multiple or divisor of the filesystem block size. You've got that.
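As a rough illustration of that rule, here is a minimal Python sketch that checks each chunk size from the benchmarks against the 64K NTFS cluster size. It also prints the amount of data in one full stripe, assuming a 6-drive RAID5 carries 5 data chunks plus 1 parity chunk per stripe. The function names are made up for this example and do not come from any tool.

```python
# Sketch only: all sizes are in KiB; the helper names are hypothetical.
CLUSTER_KIB = 64      # NTFS cluster size from the question
DATA_DISKS = 6 - 1    # 6-drive RAID5: one chunk per stripe holds parity


def is_aligned(chunk_kib, cluster_kib=CLUSTER_KIB):
    """True if the chunk is a multiple or a divisor of the cluster size."""
    return chunk_kib % cluster_kib == 0 or cluster_kib % chunk_kib == 0


def full_stripe_kib(chunk_kib, data_disks=DATA_DISKS):
    """Data carried by one full stripe (parity chunk excluded)."""
    return chunk_kib * data_disks


# Chunk sizes that appear in the benchmark runs above.
for chunk in (4, 32, 64, 128, 512, 1024):
    print(f"{chunk:>5}K chunk: aligned={is_aligned(chunk)}, "
          f"full stripe={full_stripe_kib(chunk)}K")
```

Every chunk size tested in the question passes the check, which is why alignment alone does not pick a winner here.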
Everything else is likely to be implementation dependent. Since you're starting from scratch, you should roll your own benchmarks. Instead of creating a 14 TB RAID set, test with just 500 GB from each drive in various chunk sizes. The smaller volume sizes will reduce the amount of time needed to create the volume.
When you find the optimal number for your setup, then create your 14 TB RAID set. Test again to make sure you haven't had a performance degradation.
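The test loop described above could be sketched roughly as follows. This Python snippet only builds and prints the mdadm commands; it does not run them. The scratch device name `/dev/md9`, the member partitions `/dev/sda3` through `/dev/sdf3`, and the list of chunk sizes are assumptions for illustration, so substitute your own.

```python
import shlex

# Assumptions (not from the post): scratch array name, member partitions,
# and which chunk sizes to try.
MD_DEVICE = "/dev/md9"
MEMBERS = [f"/dev/sd{letter}3" for letter in "abcdef"]
CHUNKS_KIB = [64, 128, 256, 512]
SIZE_KIB = 500 * 1024 * 1024  # 500 GiB from each member, expressed in KiB


def create_command(chunk_kib, size_kib=SIZE_KIB):
    """Build (but do not run) an mdadm command for a small RAID5 test array."""
    return [
        "mdadm", "--create", MD_DEVICE,
        "--level=5",
        f"--raid-devices={len(MEMBERS)}",
        f"--chunk={chunk_kib}",   # chunk size in KiB
        f"--size={size_kib}",     # space used from each member, in KiB
        *MEMBERS,
    ]


for chunk in CHUNKS_KIB:
    print(shlex.join(create_command(chunk)))
    print(f"# run your benchmark here, then: mdadm --stop {MD_DEVICE}")
```

Printing the commands first makes it easy to review them before anything destructive touches the disks.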
It depends on the usage scenario. Suppose you have a relatively small number of small files (say 10,000 files under 512KB) and the rest are very large files (say 5,000 files over 10MB). In that case it is better to format the drive with a 1MB cluster size; with an 8-disk RAID 10 setup, each disk then reads 128KB at the same time.
You will lose at most 508KB * 15000 of disk space if all of your small files and the tail ends of the large files are 4KB, which comes to roughly 7,500MB, or about 7.5GB of lost disk space.
You will gain about 4 times the speed compared to the jinxed setup of a 512KB chunk size with a 64KB cluster size.
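The space estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic: `slack_bytes` is a hypothetical helper that returns the unused space in the last cluster of a file, and the file counts are the ones assumed in this answer.

```python
KIB = 1024


def slack_bytes(file_bytes, cluster_bytes):
    """Unused space left in the last cluster of a file."""
    return -file_bytes % cluster_bytes


# The figure used in this answer: 15000 tails, each wasting 508KB.
total_kib = 508 * 15000
print(f"total: {total_kib / 1024:.0f} MiB ({total_kib / 1024 / 1024:.2f} GiB)")

# Per-file slack for a 4KB tail under different cluster sizes.
for cluster_kib in (64, 512, 1024):
    wasted_kib = slack_bytes(4 * KIB, cluster_kib * KIB) // KIB
    print(f"{cluster_kib}K cluster: {wasted_kib}K wasted per 4K tail")
```

Note that 508KB per file is what a 512KB allocation unit wastes on a 4KB tail; with a 1MB cluster the same tail leaves 1020KB unused, so the total would be about twice as large.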
So, as a final recipe:
Additionally:
VERY IMPORTANT:
- Do not use RAID 5 (nearly impossible to reconstruct, and buggy).
- Do not use btrfs with its own RAID 5 or RAID 6 setup (known bugs).
- Do not use RAID 0 for mission-critical data; it may run for 10 years, but it's a gamble.
- Use a UPS with software RAID setups.
- Back up the ROM BIOS firmware of LSI chipset cards and buy a spare card. Make sure both cards have the same ROM BIOS version, and make sure that if you replace one, the other can reconstruct the RAID.
These are some of the lessons from my 2 years of file server setup experience.