OS: Ubuntu 18.04.4 LTS
Kernel version: 4.15.0-76-generic
Storage type: RAID10 (4 x SSD)
Question: is this a bug in this kernel version? ext4 runs a lot slower than XFS when we perform the same SQL insert test; XFS responds much more healthily, at 2K INSERT + 2K UPDATE, while ext4 only manages 59 for both.
iostat also shows that the ext4 device was at 98.4% utilization.
Another test: everything is the same, except the kernel was upgraded to 5.6.0-050600-generic.
Check the underlying block devices; they behave very differently. ext4 labored really hard: 377 w/s, 2.94 ms wait, 98 %util. xfs had no problems: 6277 w/s, 0.07 ms wait, 42 %util.
Milliseconds versus microseconds of block device latency is difficult to explain with the file system alone. ext4 is not inherently 40x slower. Hundreds of IOPS and a couple of ms of latency look like spindles, not SSD. Although, it could be a broken or poorly tuned SSD.
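For reference, the gap can be worked out directly from the iostat figures quoted above. This is only a quick arithmetic sketch; `ratio` is a helper name made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a/b to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Figures taken from the iostat output quoted in this answer.
echo "wait time, ext4 vs xfs: $(ratio 2.94 0.07)x"
echo "writes/s,  xfs vs ext4: $(ratio 6277 377)x"
```

The wait-time ratio comes out at roughly 42x, which is where the "40x" figure comes from.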
Reformat the well-performing xfs device (sda?) with ext4. Document the file system creation commands and mount options. Re-run the test. That will be a fairer comparison, changing only one variable: the file system type.
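A minimal sketch of that procedure, assuming the device and mount point are known. `/dev/sdX` and `/mnt/test` are placeholders, not values from the question, and `run` and `reformat_as_ext4` are names invented for this example. By default it only prints the commands (dry run), because `mkfs.ext4` destroys everything on the device.

```shell
#!/bin/sh
# Placeholders (assumptions): set DEV and MNT to the real device and mount point.
DEV=${DEV:-/dev/sdX}
MNT=${MNT:-/mnt/test}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, anything else = execute them

# Echo each command so the output doubles as documentation of what was run.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

reformat_as_ext4() {
    run umount "$MNT"
    run mkfs.ext4 "$DEV"                          # DESTROYS all data on $DEV
    run mount -o defaults "$DEV" "$MNT"
    run findmnt -no SOURCE,FSTYPE,OPTIONS "$MNT"  # record the effective mount options
}

reformat_as_ext4
```

Run it with `DRY_RUN=0` as root only after checking the printed commands, and keep the output with the test results so the ext4 and xfs runs can be compared like for like.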
I found a potential answer to this:
"cat /sys/block/{block device}/queue/scheduler" indicates the system is using "cfq", which is a problem when we have SSD/NVMe drives. Changing it to "deadline" or "noop" solves this issue; for a detailed explanation, please refer to the Ubuntu wiki: https://wiki.ubuntu.com/Kernel/Reference/IOSchedulers
The CFQ scheduler is already gone in the latest stable kernel, but Ubuntu 18.04 runs an older kernel version, so changing the I/O scheduler by hand solves this.
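For anyone checking their own system, here is a small sketch that lists the active scheduler for each block device. The kernel marks the active one with square brackets, e.g. "noop deadline [cfq]". `active_scheduler` is a helper name made up for this example, and `sda` in the comment at the end is a placeholder.

```shell
#!/bin/sh
# Print the active scheduler (the bracketed entry) from a line such as
# "noop deadline [cfq]". Prints nothing if no entry is bracketed.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active scheduler for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
done

# To switch one device at runtime (sda is a placeholder):
#   echo deadline | sudo tee /sys/block/sda/queue/scheduler
```

A change written to sysfs this way is lost at reboot, so it has to be reapplied or made persistent separately.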