I am looking at creating a PostgreSQL database image (to run on Amazon EC2) and I have heard good things about XFS.
However, since ultimately you are running in a VM, the gain of XFS on EC2 might not be as great as on a barebone system.
Am I going to see a significant improvement with XFS on EC2?
No, I do not think so.
You will not lose anything with XFS, but you will not gain much either, as you are most likely constrained by EC2's I/O limits rather than by the filesystem. (This is more of an opinion than a fact.)
OTOH: I use XFS everywhere I can.
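If you do go with XFS, the setup on EC2 is minimal. A sketch, assuming the EBS volume shows up as /dev/nvme1n1 and the PostgreSQL data directory lives on the new mount (device name and mount point are placeholders for your own):

```shell
# Format the EBS volume with XFS (mkfs.xfs defaults are sensible on modern kernels)
mkfs.xfs /dev/nvme1n1

# Mount it; noatime avoids a metadata write on every read
mkdir -p /var/lib/pgsql
mount -o noatime /dev/nvme1n1 /var/lib/pgsql

# Persist across reboots; UUIDs survive NVMe device renaming between instance starts
echo "UUID=$(blkid -s UUID -o value /dev/nvme1n1) /var/lib/pgsql xfs noatime 0 2" >> /etc/fstab
```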
I am coincidentally in the process of benchmarking storage solutions, using a Windows 10 LTSB 1607 (build 14393.1770) VM on KVM (qemu 2.9.1-2), hosted on Fedora 26 x86_64 (kernel 4.13.11-200).
Long story short, a lot of it depends on your environment, but for me XFS was faster than ZFS. I'm not sure what optimizations Amazon made with EC2, but it's also based on RHEL, which in turn is based on Fedora, so hopefully my test environment is similar enough to be a decent comparison. The RHEL/Fedora team maintains XFS, and it's the default file system on their distros; I can understand why after taking these benchmarks.
Here's a good academic paper with more benchmarks comparing XFS to EXT3 and EXT4 from UC Berkeley:
https://people.eecs.berkeley.edu/~kubitron/cs262/handouts/papers/hellwig.pdf
TL;DR --- If you want a sample of the benchmarks:
The computer is a Supermicro X9SPU-F with a Xeon E3-1230 v2; the root filesystem is a 1TB Seagate Enterprise drive. The storage under test is two HGST He8 8TB helium-filled drives plus an 800GB Intel DC S3700 that's over-provisioned about 60% beyond spec. Drives are second-gen, fdisk-partitioned, and 4k-aligned (4096-byte block size).
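To verify 4k alignment like the above, parted has a built-in check; a sketch, assuming the drive is /dev/sdb and you're checking partition 1 (names are placeholders):

```shell
# Report whether partition 1 starts on an optimally aligned boundary
parted /dev/sdb align-check optimal 1

# Or check by hand: with 512-byte logical sectors, the start sector
# must be divisible by 8 for the partition to be 4096-byte aligned
cat /sys/block/sdb/sdb1/start
```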
I started with OpenZFS (pool version 5000), set to sync=always. I love it: it's convenient and easy to use, and it has great features, but it seems to use a lot of CPU. The VM felt slow even though it was the only one booted. I had the ARC throttled down to a 4GB maximum, which didn't appear to negatively affect throughput during my benchmarks.
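For reference, the two ZFS settings mentioned (sync=always and the 4GB ARC cap) look like this; the pool/dataset name is a placeholder:

```shell
# Force synchronous semantics for every write on the dataset
zfs set sync=always tank/vms

# Cap the ARC at 4 GiB persistently (value in bytes, applied at module load)
echo "options zfs zfs_arc_max=4294967296" > /etc/modprobe.d/zfs.conf

# On a running system, the same limit can be applied immediately
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max
```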
Then I switched to XFS (kernel 4.10.0-1) with bcache (1.0.8-8), the same SSD over-provisioning, and cache_mode set to writeback. It really seemed to speed things up quite a bit; the VMs feel really 'snappy'. RAM never went above 4GB during my tests. An aside: in descending order of speed, the best image format has been qcow2, followed by raw, qed, and vmdk (at least for this version of KVM).
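For anyone reproducing the bcache layout, the attach-and-writeback steps are roughly as follows; device names are placeholders, and the cache-set UUID comes from your own setup:

```shell
# Register the backing HDD and the caching SSD
make-bcache -B /dev/sdc
make-bcache -C /dev/sdd

# Attach the cache set to the backing device (UUID from bcache-super-show /dev/sdd)
echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach

# Switch from the default writethrough to writeback caching
echo writeback > /sys/block/bcache0/bcache/cache_mode
```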
I know containers (or any type of OS or VM, really) need seek times to be as short as possible, especially when dealing with lots of other tenants and concurrent clients, and here the XFS results were a bit worse. Keep in mind that even though ZFS had quicker seek times, it also had a substantial ARC (RAM) cache that the XFS machine didn't have, so it's not an even match in that respect. If the ARC were disabled, I'm fairly certain the XFS setup would be faster (sorry, I don't have data on that yet).
Hope this helps you with your decision!
As a single data point: as of April 2020 I had two very similar instances (t3.large, with roughly 4 TB EBS volumes) running a very similar workload, one of them using XFS and the other EXT4.
The XFS one was about 25% faster.
For completeness, the load was the Hyperledger Besu Ethereum client running a full sync, which translates to huge RocksDB databases with mainly random accesses. IOPS is the bottleneck.
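If you want to run this kind of comparison yourself, a small fio job approximates random-access, IOPS-bound I/O; this is a generic sketch (paths and mix ratios are assumptions), not Besu's actual access pattern:

```ini
; random 4k reads/writes with direct I/O, similar in spirit to an
; IOPS-bound RocksDB workload; run once per filesystem and compare IOPS
[randrw-test]
rw=randrw
rwmixread=70
bs=4k
size=4g
ioengine=libaio
iodepth=32
direct=1
runtime=60
time_based
directory=/mnt/test
```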