When I look at disk (block device) storage options from various cloud hosting providers, I usually see numbers such as:
- Google Cloud (Zonal SSD): 15.000 - 100.000 read IOPS
- OVH Cloud (High Speed / SSD): Up to 3.000 IOPS
- AWS (io1 / SSD): Up to 64.000 IOPS
I do not know anything about the underlying technology.
Even if these cloud providers used some of the slower SSD options available (regular consumer SATA SSDs), some of those disks come with read and write IOPS specifications in the range of 90.000 and up (looking at the Samsung 860 EVO 2.5" SSD). An NVMe SSD would give far better throughput. Even if these cloud providers stacked these SSD disks into some sort of storage cluster, I'd still be surprised to see the IOPS fall from 90.000 to 3.000.
I have the feeling that these numbers are not comparable, even though the same metric (IOPS) is used.
How should I interpret the disk IOPS listed by cloud providers vs. the disk IOPS listed by disk manufacturers?
Google does specify 900.000 to 2.700.000 IOPS for a local SSD. That shows their hardware is perfectly capable. The "zonal SSD" offers much lower IOPS, but that is a disk which is accessible by all servers in the particular zone. That means it's remote from the server where your code is running, and there's software between your server and the SSD to manage concurrent access.
Yes, that's costing a lot of IOPS. That's not unexpected. Just look at the huge difference between the local NVMe SSD (2.700.000 IOPS) and the non-NVMe one (900.000 IOPS). You already lose 66% of the raw performance just by introducing a single slow bus between the flash chips and the CPU. That's probably a few centimeters of SATA cable and the SATA chips on both ends of that cable. Raw SSD speeds are so blisteringly high that any overhead will be huge.
Intel even considered NVMe to be too slow for their Optane storage product, and went for a DIMM form factor, just like RAM. That makes sense; Intel's CPUs can do several billion memory transfers per second (not million; it's really three orders of magnitude more). However, Optane appears to be falling short in that respect: it's stuck well below a million IOPS, and the DIMM interface seems like ludicrous overkill. But the direction is clear; even NVMe might soon become too slow for local storage. The recipe for speed is direct access without overhead. The figures you quote just show how badly performance can drop when you add overhead.
Quotas. Multi-tenancy. Counting host IOPS after redundancy. Scalability limits of their (probably IP-based) storage stack. Selling a faster SSD disk as a premium product. Actually being honest and conservative about what is practical. The list of possible reasons is long.
Should one disk be too limiting, you can attach several and use them all on one host, say with LVM (a sketch follows below). It is a bit strange to have to size SSDs for IOPS rather than capacity, but perhaps those are the constraints of these disk types.
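As a minimal sketch, assuming two extra volumes show up as /dev/sdb and /dev/sdc (hypothetical device names; check lsblk on your host), an LVM stripe across both could look like this:

```bash
# Hypothetical device names; verify with lsblk before running anything.
pvcreate /dev/sdb /dev/sdc
vgcreate fastvg /dev/sdb /dev/sdc
# -i 2 stripes the logical volume across both physical volumes,
# so the IOPS budgets of both disks are used in parallel.
lvcreate --name fastlv -i 2 -l 100%FREE fastvg
mkfs.ext4 /dev/fastvg/fastlv
mount /dev/fastvg/fastlv /mnt/fast
```

Striping spreads random I/O over the attached disks, so the aggregate IOPS limit is roughly the sum of the per-disk limits (up to whatever per-instance cap the provider enforces).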
If you wish to run your own storage array, do that. Of course, that means you cannot use the managed storage of, say, AWS or GCP.
Whatever your storage options are, you should test with something resembling your workload: a realistic load if you can, synthetic I/O with `fio` or `diskspd` if you have to. Especially if you actually need to push 100k IOPS, that level of load is still a serious exercise for a storage stack.