There are many discussions and websites that explain how to set up a Linux software RAID with mdadm,
usually with the chunk size of the new RAID set to 128 kB or 512 kB. Serverfault is no exception.
I am now building a new media NAS, and I see no good reason why I shouldn't use a chunk size of 4 kB. Each of the four physical disks in the 'RAID-5-to-be' has 4 kB sectors. Surely a 4 kB chunk size makes the most sense, giving a 1:1 mapping from the RAID volume down to the disk sectors? Then, on top of that, I would create the file system (which will be EXT4) with a 4 kB block size.
How is a 128 kB chunk size (for example) more beneficial when the disks only have 4 kB sectors?
This is related to read-ahead. Rotating drives suffer from an extremely slow access time, so you want to minimize the number of seeks and read sequentially as much as possible. To achieve this, Linux uses a default read-ahead value of 128 KB, which means that when you request a 1 KB block, up to 128 KB may actually be read and cached.
Check your current read-ahead setting before changing anything.
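One way to do that check is with `blockdev --getra`, which reports the read-ahead in 512-byte sectors rather than in KB. The sketch below is only an illustration: `/dev/sda` is a placeholder device name, and `sectors_to_kib` is a small helper made up for this example, not part of any tool.

```shell
#!/bin/sh
# Placeholder device (an assumption): substitute your own disk or md array,
# for example DEV=/dev/md0.
DEV="${DEV:-/dev/sda}"

# Hypothetical helper: blockdev reports read-ahead in 512-byte sectors,
# so convert that figure to KiB for easier reading.
sectors_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

# Reading the setting usually requires root; fail quietly otherwise.
if [ -b "$DEV" ] && ra=$(blockdev --getra "$DEV" 2>/dev/null); then
    echo "$DEV read-ahead: $ra sectors ($(sectors_to_kib "$ra") KiB)"
else
    echo "cannot read $DEV (missing device or insufficient permissions)" >&2
fi
```

A reported value of 256 sectors corresponds to the 128 KB default mentioned above.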
Actually, this 128 KB value is extremely conservative and is better suited to old ATA drives from ten years ago with a 512 KB cache. For modern drives with a 64 MB cache, a value of 1 or 2 MB would probably be a better fit. For hardware RAID with large caches, values of 64 MB or more are to be preferred.
Don't forget to experiment with the read-ahead setting and measure how it affects your performance.
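A minimal sketch of changing the value, assuming `blockdev --setra` is used (it takes the size in 512-byte sectors). The device name, the `RA_KIB` and `APPLY` variables, and the `kib_to_sectors` helper are all assumptions made for this example. The script only prints the command unless `APPLY=1` is set, so it can be tried without touching a live system.

```shell
#!/bin/sh
# Placeholder device and target read-ahead size (both assumptions).
DEV="${DEV:-/dev/sda}"
RA_KIB="${RA_KIB:-1024}"    # 1024 KiB = 1 MiB

# Hypothetical helper: blockdev --setra expects 512-byte sectors.
kib_to_sectors() {
    echo $(( $1 * 1024 / 512 ))
}

if [ "${APPLY:-0}" = "1" ] && [ -b "$DEV" ]; then
    # Needs root. The change does not survive a reboot.
    blockdev --setra "$(kib_to_sectors "$RA_KIB")" "$DEV"
    blockdev --getra "$DEV"
else
    echo "would run: blockdev --setra $(kib_to_sectors "$RA_KIB") $DEV"
fi
```

Run as root with, for example, `APPLY=1 RA_KIB=2048 DEV=/dev/md0 sh script.sh`, then repeat your usual sequential read benchmark after each change to compare results.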