I'm planning to (privately) deploy a server that will be hammered with random I/O on files ranging from 100MB to 50GB. The requests will range from 128KB to 4MB. The read/write profile will be roughly 50:50, leaning slightly toward reads.
Which filesystem can handle this load best? For now I've opted for XFS, but what tunables should I look into?
Thanks
The requirements and constraints:

- Files from 100MB to 50GB, roughly 14TB in the filesystem overall
- Request sizes from 128KB to 4MB
- Roughly 50:50 read/write, leaning slightly toward reads
- XFS is the current candidate

Unknowns that would help:

- How many spindles back the storage array, and how fast they spin
- Whether the reads and writes happen simultaneously
- Whether whole files are streamed, or random offsets within open files are touched
Sequential I/O
If the 50:50 ratio is represented by reading and writing whole files, and pretty big files at that, then your access patterns are more sequential than random as far as the filesystem is concerned. Use an extent-based filesystem to keep on-disk allocation as sequential as possible for best performance. Since the files are so large, read-ahead will provide a significant performance boost if supported by the hardware (some RAID controllers provide this).
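Read-ahead can also be raised at the block-device level on the OS side; a minimal sketch, assuming the array shows up as /dev/sdb (a placeholder, adjust for your device):

    # Check the current read-ahead setting (reported in 512-byte sectors)
    blockdev --getra /dev/sdb

    # Raise read-ahead to 16384 sectors (8 MB) so large sequential requests get prefetched
    blockdev --setra 16384 /dev/sdb

    # The same knob via sysfs, expressed in kilobytes
    echo 8192 > /sys/block/sdb/queue/read_ahead_kb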
Random I/O
This changes if you're planning on doing the read/write activities simultaneously, at which point it does become significantly random. The same applies if you're holding a large number of files open and reading/writing small portions within those files as if it were a database.
One of the biggest misconceptions I run into is the idea that a defragmented filesystem performs better than a fragmented one when handling highly random I/O. This is only true where metadata operations suffer greatly on a fragmented filesystem. At very high levels of fragmentation, extent-based filesystems can actually suffer more performance degradation than other styles of block management.
That said, this problem only becomes apparent when the I/O access patterns and rate are pushing the disks to their maximum capabilities. With 14TB in the filesystem, that means somewhere between 7 and 50 spindles in the actual storage array, which yields a vast range of capability: from roughly 630 IOPS for 7x 2TB 7.2K RPM drives to roughly 9000 IOPS for 50x 300GB 15K RPM drives. The 7.2K RPM RAID array will hit I/O saturation a lot sooner than the 15K RPM RAID array will.
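For what it's worth, the rough spindle arithmetic behind those figures (the per-drive IOPS numbers are rule-of-thumb assumptions, not measurements):

    # ~90 random IOPS per 7.2K RPM drive, ~180 per 15K RPM drive (rules of thumb)
    echo "7x 2TB 7.2K RPM:   $((7 * 90)) IOPS"      # ~630
    echo "50x 300GB 15K RPM: $((50 * 180)) IOPS"    # ~9000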
If your I/O rate is not pushing your storage to its limits, the choice of filesystem should be based more on overall management flexibility than on squeezing out the last few percentage points of performance.
However, if your I/O actually IS running your storage flat out, that's when the tweaking starts to become necessary.
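One way to find out which side of that line you're on is to benchmark with a synthetic load shaped like the one described; a hedged sketch using fio, assuming the filesystem is mounted at /srv/data (path, sizes and queue depth are placeholders to adjust):

    # Mixed random read/write, 55% reads, 128KB-4MB request sizes, against the target filesystem
    fio --name=mixed-load --directory=/srv/data \
        --rw=randrw --rwmixread=55 --bsrange=128k-4m \
        --ioengine=libaio --direct=1 --iodepth=16 --numjobs=4 \
        --size=10g --runtime=120 --time_based --group_reporting

Compare the reported IOPS and latency against what your spindle count should deliver; if you're already near that ceiling, the filesystem tunables below are where to look next.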
XFS:

-l logdev=/dev/sdc3
    Puts the log (journal) on a separate device from the data, so journal writes don't compete with file I/O.

EXT4:

-E stride
    Set to the number of blocks (either 512b or 4K, depending on the drive) in a single disk-stripe of the RAID.
-E stripe-width
    The equivalent of 'swidth' in XFS: stride multiplied by the number of data-bearing disks in the RAID.
-O journal_dev /dev/sdc3
    Puts the journal on a separate device, as logdev does for XFS.
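Put together as example invocations (device names and RAID geometry below are placeholders, not recommendations; assume an 8-data-disk array with a 64KB chunk size and a spare partition /dev/sdc3 for the journal):

    # XFS: external log plus stripe unit/width matching the RAID geometry
    mkfs.xfs -l logdev=/dev/sdc3,size=128m -d su=64k,sw=8 /dev/sdb1
    mount -o logdev=/dev/sdc3 /dev/sdb1 /srv/data    # the external log must also be named at mount time

    # EXT4: stride = chunk / block size (64K / 4K = 16), stripe-width = stride * data disks (16 * 8 = 128)
    mke2fs -O journal_dev /dev/sdc3                  # create the external journal device first
    mkfs.ext4 -E stride=16,stripe-width=128 -J device=/dev/sdc3 /dev/sdb1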
I think the real problem here is not just the filesystem, but the parameters you use with it. One setting that is likely to matter is the read-ahead size.
But OK, let's just talk about names. Besides XFS, I think ext4 will suit your needs. The bottom line is that you need an extent-based filesystem to avoid fragmentation as much as possible. Both XFS and ext4 support delayed allocation IIRC, so both should also improve your chances of getting writes merged.
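If you want to check whether extents are actually staying contiguous once the data is in place, a couple of read-only checks (the file and device names are placeholders):

    # Per-file extent map; a few large extents means the allocator is doing its job
    filefrag -v /srv/data/somefile.bin

    # Filesystem-wide fragmentation report for XFS (read-only)
    xfs_db -r -c frag /dev/sdb1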
regards,
Mulyadi.
Given the scale of data you have, I think you want to look at a networked cluster filesystem such as Lustre or IBM's proprietary GPFS. These are designed to deliver high performance under demanding workloads like yours.