I'm planning to (privately) deploy a server that will be hammered with random I/O on files ranging from 100MB to 50GB. The requests will range from 128KB to 4MB. The read/write profile will be roughly 50:50, leaning slightly toward reads.
Which filesystem can handle this load best? For now I've opted for XFS, but what tunables should I look into?
Thanks
The requirements and constraints:

- Files from 100MB to 50GB, roughly 14TB in the filesystem overall
- Request sizes from 128KB to 4MB
- Roughly 50:50 read/write, leaning slightly toward reads
- XFS is the current candidate

Unknowns that would help:

- How many spindles back the storage array, and how fast they spin
- Whether the reads and writes happen simultaneously
- Whether whole files are streamed, or random offsets within open files are touched
Sequential I/O
If the 50:50 ratio is represented by reading and writing whole files, and pretty big files at that, then your access patterns are more sequential than random as far as the filesystem is concerned. Use an extent-based filesystem to keep on-disk allocation as sequential as possible for best performance. Since the files are so large, read-ahead will provide a significant performance boost if supported by the hardware (some RAID controllers provide this).
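Read-ahead can also be raised at the block-device level on the OS side; a minimal sketch, assuming the array shows up as /dev/sdb (a placeholder, adjust for your device):

    # Check the current read-ahead setting (reported in 512-byte sectors)
    blockdev --getra /dev/sdb

    # Raise read-ahead to 16384 sectors (8 MB) so large sequential requests get prefetched
    blockdev --setra 16384 /dev/sdb

    # The same knob via sysfs, expressed in kilobytes
    echo 8192 > /sys/block/sdb/queue/read_ahead_kb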
Random I/O
This changes if you're planning on doing the read/write activities simultaneously, at which point it does become significantly random. The same applies if you're holding a large number of files open and reading/writing small portions within those files as if it were a database.
One of the biggest misconceptions I run into is the idea that a defragmented filesystem performs better than a fragmented one when handling highly random I/O. This is only true where metadata operations suffer greatly on a fragmented filesystem. At very high levels of fragmentation, extent-based filesystems can actually suffer more performance degradation than other styles of block management.
That said, this problem only becomes apparent when the I/O access patterns and rate are pushing the disks to their maximum capabilities. With 14TB in the filesystem, that means somewhere between 7 and 50 spindles in the actual storage array, which yields a vast range of capability: from roughly 630 IOPS for 7x 2TB 7.2K RPM drives to roughly 9000 IOPS for 50x 300GB 15K RPM drives. The 7.2K RPM RAID array will hit I/O saturation a lot sooner than the 15K RPM RAID array will.
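For what it's worth, the rough spindle arithmetic behind those figures (the per-drive IOPS numbers are rule-of-thumb assumptions, not measurements):

    # ~90 random IOPS per 7.2K RPM drive, ~180 per 15K RPM drive (rules of thumb)
    echo "7x 2TB 7.2K RPM:   $((7 * 90)) IOPS"      # ~630
    echo "50x 300GB 15K RPM: $((50 * 180)) IOPS"    # ~9000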
If your I/O rate is not pushing your storage to its limits, the choice of filesystem should be based more on overall management flexibility than on squeezing out the last few percentage points of performance.
However, if your I/O actually IS running your storage flat out, that's when the tweaking starts to become necessary.
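One way to find out which side of that line you're on is to benchmark with a synthetic load shaped like the one described; a hedged sketch using fio, assuming the filesystem is mounted at /srv/data (path, sizes and queue depth are placeholders to adjust):

    # Mixed random read/write, 55% reads, 128KB-4MB request sizes, against the target filesystem
    fio --name=mixed-load --directory=/srv/data \
        --rw=randrw --rwmixread=55 --bsrange=128k-4m \
        --ioengine=libaio --direct=1 --iodepth=16 --numjobs=4 \
        --size=10g --runtime=120 --time_based --group_reporting

Compare the reported IOPS and latency against what your spindle count should deliver; if you're already near that ceiling, the filesystem tunables below are where to look next.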
XFS:

-l logdev=/dev/sdc3
    Puts the log (journal) on a separate device from the data, so journal writes don't compete with file I/O.

EXT4:

-E stride
    Set to the number of blocks (either 512b or 4K, depending on the drive) in a single disk-stripe of the RAID.
-E stripe-width
    The equivalent of 'swidth' in XFS: stride multiplied by the number of data-bearing disks in the RAID.
-O journal_dev /dev/sdc3
    Puts the journal on a separate device, as logdev does for XFS.
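Put together as example invocations (device names and RAID geometry below are placeholders, not recommendations; assume an 8-data-disk array with a 64KB chunk size and a spare partition /dev/sdc3 for the journal):

    # XFS: external log plus stripe unit/width matching the RAID geometry
    mkfs.xfs -l logdev=/dev/sdc3,size=128m -d su=64k,sw=8 /dev/sdb1
    mount -o logdev=/dev/sdc3 /dev/sdb1 /srv/data    # the external log must also be named at mount time

    # EXT4: stride = chunk / block size (64K / 4K = 16), stripe-width = stride * data disks (16 * 8 = 128)
    mke2fs -O journal_dev /dev/sdc3                  # create the external journal device first
    mkfs.ext4 -E stride=16,stripe-width=128 -J device=/dev/sdc3 /dev/sdb1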
I think the real problem here is not just the filesystem, but the parameters you use with it. One setting that is likely to matter is the read-ahead size.
But OK, let's just talk about names. Besides XFS, I think ext4 will suit your needs. The bottom line is that you need an extent-based filesystem to avoid fragmentation as much as possible. Both XFS and ext4 support delayed allocation IIRC, so both should also improve your chances of getting writes merged.
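If you want to check whether extents are actually staying contiguous once the data is in place, a couple of read-only checks (the file and device names are placeholders):

    # Per-file extent map; a few large extents means the allocator is doing its job
    filefrag -v /srv/data/somefile.bin

    # Filesystem-wide fragmentation report for XFS (read-only)
    xfs_db -r -c frag /dev/sdb1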
regards,
Mulyadi.
Given the scale of data you have, I think you want to look at a networked cluster filesystem such as Lustre or IBM's proprietary GPFS. These are designed to deliver high performance under demanding workloads like yours.