We have a system used for a GIS database (with Postgres as the underlying engine) that uses a software RAID 5 array of 4x2TB Samsung EVO 870 SATA SSDs as its database drive. There is a nightly backup script that dumps the tables to a local temporary directory, gzips them, and transfers them to a separate machine (with mv). Normally the backup starts at 1830 and runs until 0500; yes, it's a big backup.
A month or so ago, the external system fell offline, so the mv step stopped working and the temporary storage area filled up with unmoved files. After the external system was repaired, we noticed that the temp area was full and deleted everything in it, about 3.5TB of files. About two weeks ago, we noticed that the daily backup was not completing until 1000. My suspicion is that things have slowed down because the blocks freed in the temp directory were never trimmed, so whenever the backup writes a new temp file, the SSDs have to erase blocks before they can rewrite them.
fstrim -av does not print anything, which suggests that no filesystem reports support for DISCARD.
This system does have LVM on top of the RAID array. The database and temp directories are on an ext4 filesystem (was ext2, but stuff happened) in its own LV, mounted at /db; fstrim -v /db reports "File system does not support DISCARD".
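To see which layer stops advertising discard, the kernel's per-device queue limits can be read from sysfs. This is a minimal sketch, not output from this machine: the function name discard_status and the SYSFS_ROOT override are made up for illustration, and dm-2 is only an assumption based on the LV's "Block device 253:2" line.

```shell
#!/bin/sh
# Report whether each block layer advertises discard support.
# SYSFS_ROOT is a hypothetical override so the function can be pointed at a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/block}"

discard_status() {
    dev="$1"
    f="$SYSFS_ROOT/$dev/queue/discard_max_bytes"
    if [ ! -r "$f" ]; then
        echo "$dev: unknown"
    elif [ "$(cat "$f")" -gt 0 ]; then
        # A non-zero limit means the device accepts discard requests.
        echo "$dev: discard supported"
    else
        echo "$dev: no discard"
    fi
}

# Physical disks, the md array, and the LV's device-mapper node (assumed to be dm-2).
for dev in sda sdb sdc sdd md0 dm-2; do
    discard_status "$dev"
done
```

If the disks report support but md0 does not, the RAID layer is the one dropping it; lsblk --discard shows the same limits in table form.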
OS version: Debian Linux 8 (jessie), Linux 3.16.0-4-amd64 x86_64
RAID information:
root@local-database:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sda1[7] sdd1[4] sdc1[5] sdb1[6]
5860147200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
bitmap: 1/2 pages [4KB], 524288KB chunk
root@local-database:~# mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sun Dec 27 17:55:35 2015
Raid Level : raid5
Array Size : 5860147200 (5588.67 GiB 6000.79 GB)
Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue Aug 8 14:07:27 2023
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : local-database:0 (local to host local-database)
UUID : 18d38d9a:daaa0652:8e43a020:133e5a4f
Events : 53431
Number Major Minor RaidDevice State
7 8 1 0 active sync /dev/sda1
6 8 17 1 active sync /dev/sdb1
5 8 33 2 active sync /dev/sdc1
4 8 49 3 active sync /dev/sdd1
Information about the specific LV used for the database and temp areas:
--- Logical volume ---
LV Path /dev/MainDisk/postgres
LV Name postgres
VG Name MainDisk
LV UUID TpKgGe-oHKS-Y341-029v-jkir-lJn8-jo8xmZ
LV Write Access read/write
LV Creation host, time local-database, 2015-12-27 18:04:04 -0800
LV Status available
# open 1
LV Size 4.78 TiB
Current LE 1251942
Segments 4
Allocation inherit
Read ahead sectors auto
- currently set to 6144
Block device 253:2
PV information:
root@local-database:~# pvdisplay
--- Physical volume ---
PV Name /dev/md0
VG Name MainDisk
PV Size 5.46 TiB / not usable 2.50 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 1430699
Free PE 121538
Allocated PE 1309161
PV UUID N3tcTa-LBw2-D8gI-6Jg4-9v3T-KWn2-5CDVzK
I would really like to get the backup times back down to 11 hours, so that we're no longer colliding with actual work times. Is there something in the TRIM options that I can do here, or is there something else I've missed? I have checked that the database did not suddenly grow any new tables or grow 50% overnight. There are no network connection issues, and as far as I can see nothing odd happened to the network or the external server just before the backup started taking 16 hours. Is there anything else I'm missing?
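One way to narrow down where the extra hours go would be to time each stage of the backup separately. A minimal sketch follows; the helper name time_stage is hypothetical, and the table, database, host, and path names in the commented usage lines are placeholders rather than the real script.

```shell
#!/bin/sh
# Run a command and report its wall-clock duration and exit status.
time_stage() {
    label="$1"; shift
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    echo "$label: $((end - start))s (exit $rc)"
    return "$rc"
}

# Hypothetical usage inside the backup script (names and paths are placeholders):
#   time_stage dump pg_dump -t some_table -f /db/tmp/some_table.sql gisdb
#   time_stage gzip gzip /db/tmp/some_table.sql
#   time_stage move mv /db/tmp/some_table.sql.gz /mnt/backup-host/

# Harmless demonstration so the script runs as-is.
time_stage "demo sleep" sleep 1
```

If the dump and gzip stages account for the slowdown, slow writes to the temp area are a plausible cause; if the mv stage does, the network or the remote machine deserves a second look.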
Edit due to comments: The actual SSDs are only a year and a half old; they replaced the original 250GB SSDs in April 2022. (We ran out of space, and the RAID array, LV, and filesystem were expanded in place.) We're using software RAID, bone-standard Linux with mdadm.
Edit in response to comments:
root@local-database:~# lsblk -d
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
sdb 8:16 0 1.8T 0 disk
sdc 8:32 0 1.8T 0 disk
sdd 8:48 0 1.8T 0 disk
root@local-database:~# cat /sys/module/raid456/parameters/devices_handle_discard_safely
N
root@local-database:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 21
Model: 2
Model name: AMD FX(tm)-8320 Eight-Core Processor
Stepping: 0
CPU MHz: 1400.000
CPU max MHz: 3500.0000
CPU min MHz: 1400.0000
BogoMIPS: 7023.19
Virtualization: AMD-V
L1d cache: 16K
L1i cache: 64K
L2 cache: 2048K
L3 cache: 8192K
NUMA node0 CPU(s): 0-7
According to an article linked by Nikita Kyprianov in the comments below, Samsung EVO 870s have serious trouble with AMD hardware, which this clearly is. So that would seem to be that. I guess we'll just have to live with it...
You need to enable discard support in /etc/lvm/lvm.conf (issue_discards = 1).
I can't remember whether this also needs to be set in md; there's no mention of it in my local man pages.
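A quick way to check both settings is sketched below; the function name issue_discards_value and the LVM_CONF override are made up for illustration. Two caveats, as far as I understand them: lvm.conf(5) describes issue_discards as applying to space released by lvremove and lvreduce, so it may not be what fstrim depends on, and on the md side the relevant switch appears to be the raid456 parameter devices_handle_discard_safely, which the question shows set to N.

```shell
#!/bin/sh
# Show the discard-related settings for LVM and md raid456.
# LVM_CONF is a hypothetical override so the parser can be tried on a copy.
LVM_CONF="${LVM_CONF:-/etc/lvm/lvm.conf}"

issue_discards_value() {
    # Print the value of the first uncommented issue_discards line, if any.
    sed -n 's/^[[:space:]]*issue_discards[[:space:]]*=[[:space:]]*\([01]\).*/\1/p' "$1" 2>/dev/null | head -n 1
}

echo "lvm issue_discards: $(issue_discards_value "$LVM_CONF")"

param=/sys/module/raid456/parameters/devices_handle_discard_safely
if [ -r "$param" ]; then
    echo "raid456 devices_handle_discard_safely: $(cat "$param")"
fi

# Assumption: raid456 only passes discards down when this parameter is Y, and it
# is read when the array is assembled. Opting in is only sensible if the SSDs
# reliably return zeroes for discarded blocks; otherwise parity can go stale.
#   echo 'options raid456 devices_handle_discard_safely=Y' > /etc/modprobe.d/raid456.conf
```

If the raid456 module is loaded from the initramfs, the initramfs would probably need regenerating before the modprobe option takes effect.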