All other things being equal, how would a storage array's IOPS performance change if one used larger disks?
For example, take an array with 10 x 100GB disks.
Measure IOPS for sequential 256KB block writes (or any IOPS metric).
Let's assume the resulting measured IOPS is 1000 IOPS.
Swap the array for one with 10 x 200GB disks. Format with the same RAID configuration, same block size, etc.
Would one expect the IOPS to remain the same, increase, or decrease? Would the change be roughly linear, i.e. increase or decrease by 2x (as I've increased the disk capacity by 2x)?
Repeat these questions with 10 x 50GB disks.
Edit: More Context
This question resulted from a conversation among my sysadmin team, which is not well versed in all things storage. (Comfortable with many aspects of storage, but not the details of managing a SAN or whatever.) We are receiving a big pile of new NetApp trays that have higher per-disk capacity -- double the capacity -- than our existing trays. The comment came up that the IOPS of the new trays would be lower just because the disks were larger. Then a car analogy came up to explain this. Neither comment sat well with me, so I wanted to run it out to The Team, i.e. Stack-Exchange-land.
The car analogy was something about two cars with different acceleration but the same top speed running a quarter mile, then changing the distance to a half mile. Actually, I can't remember the exact analogy, but since I found another one on the interwebz that was similar, I figured it was probably a common IOPS analogy.
In some ways, the actual answer to the question doesn't matter that much to me, as we are not using this information to evaluate a purchase. But we do need to evaluate the best way to attach the trays to an existing head, and the best way to carve out aggregates and volumes.
I know this is probably a hypothetical question... But the IT world really doesn't work that way. There are realistic constraints to consider, plus other things that can influence IOPS...
50GB and 100GB disks don't really exist anymore. Think more: 72, 146, 300, 450, 600, 900, 1200GB in enterprise disks and 500, 1000, 2000, 3000, 4000, 6000GB in nearline/midline bulk-storage media.
There's so much abstraction in modern storage (disk caching, controller caching, SSD offload, etc.) that any differences would be difficult to discern.
You have different drive form factors, interfaces and rotational speeds to consider. SATA disks have a different performance profile than SAS or nearline SAS. 7,200RPM disks behave differently than 10,000RPM or 15,000RPM. And the availability of the various rotational speeds is limited to certain capacities.
Physical controller layout: SAS expanders and RAID/SAS controllers can influence IOPS, depending on disk layout, oversubscription rates, and whether the connectivity is internal to the server or in an external enclosure. Large numbers of SATA disks perform poorly on expanders and during drive-error conditions.
Some of this can be influenced by fragmentation and by how much of the disk array's capacity is in use.
Ever hear of short-stroking?
Software versus hardware RAID, prefetching, adaptive profiling...
What leads you to believe that capacity would have any impact on performance in the first place? Can you provide more context?
Edit:
If the disk type, form factor, interface and used-capacity are the same, then there should be no appreciable difference in IOPS. Let's say you were going from 300GB to 600GB enterprise SAS 10k disks. With the same spindle count, you shouldn't see any performance difference...
However, if the NetApp disk shelves you mention employ 6Gbps or 12Gbps SAS backplanes versus a legacy 3Gbps, you may see a throughput change in going to newer equipment.
One place where there is a direct relationship between disk size and IOPS is in the Amazon AWS cloud and other "cloudy" services. Two types of AWS services (Elastic Block Store and Relational Database Service) provide higher IOPS for larger disk sizes.
Note that this is an artificial restriction placed by Amazon on their services. There is no hardware-bound reason for it. However, I have seen devops types who are unfamiliar with unvirtualized hardware believe this restriction also applies to desktop systems and the like. The disk size/IOPS relationship is a cloud marketing restriction, not a hardware restriction.
To answer your question directly - all other things being equal = no change whatsoever when GB changes.
You don't measure IOPS with GB. You use the seek time and the latency.
I could rewrite it all here, but the examples below already do all that and I would simply be repeating it (there's also a quick sketch of the formula after the links):
https://ryanfrantz.com/posts/calculating-disk-iops.html
http://www.big-data-storage.co.uk/how-to-calculate-iops/
http://www.wmarow.com/strcalc/
http://www.thecloudcalculator.com/calculators/disk-raid-and-iops.html
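To make those links concrete, here's a minimal Python sketch of the classic estimate they walk through: one random IO costs roughly one average seek plus half a rotation. The seek times and RPMs below are illustrative assumptions, not specs of any particular drive:

```python
def disk_iops(avg_seek_ms: float, rpm: int) -> float:
    """Estimate random IOPS for one spinning disk.

    One random IO costs roughly one average seek plus the average
    rotational latency (half a revolution).
    """
    rotational_latency_ms = 0.5 * 60_000 / rpm  # half a rotation, in ms
    return 1000 / (avg_seek_ms + rotational_latency_ms)

# Capacity (GB) appears nowhere in the formula:
print(round(disk_iops(avg_seek_ms=3.5, rpm=15_000)))  # ~182 (15k SAS class)
print(round(disk_iops(avg_seek_ms=8.5, rpm=7_200)))   # ~79  (7.2k SATA class)
```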
I should point out that IOPS are not a great measurement of speed for sequential writes, but let's just go with it.
I suspect the seek and write times of disk heads are pretty consistent despite the size of the disks. 20 years ago we were all using 60GB disks with (roughly - certainly not linearly) the same read/write speeds.
I am making an educated guess, but I don't think that the density of the disk relates linearly to the performance of the disk.
OK, to your questions in order:
Would the IOPS remain the same, increase, or decrease? They would probably remain roughly equivalent to one another.
Would the change be roughly linear? The history of spinning media tells me there is probably no such relationship.
And with 10 x 50GB disks? Again, roughly equivalent.
Your speed, in all these cases, comes from the fact that the RAID acts like one single disk with ten write heads, so you can send 1/10th of the work in parallel to each disk.
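As a quick sketch with the question's own numbers (assuming plain striping and ignoring parity, mirroring, and caching):

```python
SPINDLES = 10
MEASURED_ARRAY_IOPS = 1000  # from the question's baseline test

# With striping, the array looks like one disk with ten heads:
per_disk_iops = MEASURED_ARRAY_IOPS / SPINDLES  # ~100 IOPS per spindle

# Per-disk IOPS depends on seek time and rotational speed, not capacity,
# so swapping in larger or smaller disks of the same type changes nothing:
for capacity_gb in (50, 100, 200):
    print(f"10 x {capacity_gb}GB: ~{per_disk_iops * SPINDLES:.0f} IOPS")
```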
Whilst I have no hard numbers to show you, my past experience tells me that increasing your disks' performance is not quite so simple as getting more capacity.
Despite what the marketing people tell you is innovation, before the arrival of cheap(er) solid-state disks there was little significant development in the performance of spinning media over the last 20 years. Presumably there's only so much you can get out of rust, and only so fast we can get our current models of disk heads to go.
Storage performance scales with each spindle added. The rotational speed of the drive is the biggest factor, so adding a 10k RPM drive will give more performance (in terms of IO/s for random IO or MB/s for streaming IO) than a 7.2k RPM drive. The size of the drive has virtually no effect.
People say small drives go faster simply because you require more spindles per usable TB. Increasing the drive size of those spindles won't decrease performance, but it will allow you to fit more data on the disks, which may result in an increased workload.
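A hedged sketch of that trade-off; the 140-IOPS-per-spindle figure for a 10k SAS drive is an assumption for illustration, and RAID overhead is ignored:

```python
import math

def array_for_capacity(usable_tb: float, drive_gb: int,
                       per_disk_iops: int = 140) -> tuple[int, int]:
    """Spindles needed to reach a usable capacity, and the IOPS they bring."""
    spindles = math.ceil(usable_tb * 1000 / drive_gb)
    return spindles, spindles * per_disk_iops

for drive_gb in (300, 600, 1200):
    spindles, iops = array_for_capacity(usable_tb=12, drive_gb=drive_gb)
    print(f"{drive_gb}GB drives: {spindles} spindles, ~{iops} IOPS")
# 300GB drives: 40 spindles, ~5600 IOPS
# 600GB drives: 20 spindles, ~2800 IOPS
# 1200GB drives: 10 spindles, ~1400 IOPS
```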
If you assume all else is equal, the performance characteristics of larger-capacity disks don't change very much. A 10K RPM FC drive has very similar characteristics regardless of whether it's 300GB or 3TB. The platters rotate at the same rate, and the heads seek at the same speed.
Sustained throughput likewise - not much difference. This is the root of a lot of performance problems though, as in many cases people buy terabytes; they don't buy IOPS or MB/sec.
And it'll take 10x as long to rebuild/copy a 3TB drive as a 300GB drive.
We've actually had to look at substantial overcapacity for storage projects as a result - drive sizes are still growing, but their performance capability isn't growing much. So in at least one case, we've bought ~400TB of storage to fill a 100TB requirement, because we need the spindles.
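In sketch form, the sizing logic is: work out the spindle count each requirement demands and buy the larger. The drive size and per-spindle IOPS below are illustrative assumptions, not our actual project figures:

```python
import math

def spindles_needed(capacity_tb: float, iops_required: int,
                    drive_tb: float, per_disk_iops: int) -> int:
    by_capacity = math.ceil(capacity_tb / drive_tb)
    by_iops = math.ceil(iops_required / per_disk_iops)
    return max(by_capacity, by_iops)  # whichever requirement is harder wins

# e.g. a 100TB requirement that must also sustain ~10,000 IOPS from
# 4TB nearline disks at ~75 IOPS each:
n = spindles_needed(capacity_tb=100, iops_required=10_000,
                    drive_tb=4, per_disk_iops=75)
print(n, "spindles =", n * 4, "TB raw")  # 134 spindles = 536 TB raw
```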
If you are using rotating disks (not SSDs) then, everything else being equal, transfer speed is higher if you use the outer tracks of the disk. That would happen automatically if you use a disk that is only partially filled. At the same time, if a disk is only partially filled, your average head movement would be shorter, and the number of head movements would be lower because there is more data per track.
That's true whether you use a single disk or a RAID drive.
Now if you are comparing 100GB and 2000GB disks, you can be sure that everything else is not equal. But if the same manufacturer offers 500GB, 1TB, 1.5TB and 2TB drives with one, two, three and four platters, then everything else is likely to be equal, and 10 x 500GB will be slower than 10 x 2TB to store 4TB of data (there will be no difference if you store 100 GB only, because the 500 GB drives will also be almost empty).
But with RAID arrays, you will be limited not so much by transfer speed as by rotational latency. So higher RPM will matter more, and you'll often find higher RPM together with lower capacity. On the other hand, if you go with high-RPM/low-capacity drives, you might also look at SSDs.
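To put rough numbers on that short-stroking effect: if only a fraction of the platter holds data, seeks get shorter while rotational latency stays fixed. The linear seek-vs-fill scaling below is a crude simplifying assumption, not a real drive model:

```python
def short_stroked_iops(full_seek_ms: float, rpm: int, fill_fraction: float) -> float:
    rotational_latency_ms = 0.5 * 60_000 / rpm  # fixed by RPM, not by fill
    seek_ms = full_seek_ms * fill_fraction       # crude: seek shrinks with fill
    return 1000 / (seek_ms + rotational_latency_ms)

# A 7.2k drive with an 8.5ms average seek when fully used:
for fill in (1.0, 0.5, 0.2):
    print(f"{fill:.0%} full: ~{short_stroked_iops(8.5, 7200, fill):.0f} IOPS")
# 100% full: ~79 IOPS; 50% full: ~119 IOPS; 20% full: ~170 IOPS
```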