Problem
I want to enable background TRIM operations on a swap partition on an SSD under Linux. According to several articles (e.g. this one), the kernel detects this configuration and performs discard operations automatically, but in my tests it does not seem to work, even when the “discard” mount option is used to force this behavior.
Scenario
- Debian Wheezy running Linux 3.2.0
- SSD disk: 1 x 120GB OCZ Vertex 3 MI
- 2GB swap “plain” partition, w/o other layers (LVM, RAID, etc.)
Background
These are the steps I follow to check if background TRIM is working on the swap partition:
TRIM support: check if the SSD disk supports TRIM commands and the kernel flags the device as non-rotational:
    # hdparm -I /dev/sda | grep TRIM
       *    Data Set Management TRIM supported (limit 1 block)
       *    Deterministic read data after TRIM
    # cat /sys/block/sda/queue/rotational
    0
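The same check can be scripted. The following is a minimal sketch, assuming the kernel exposes the `rotational`, `discard_granularity` and `discard_max_bytes` attributes under `/sys/block/<device>/queue`. The helper names `queue_attrs` and `looks_trim_capable` are hypothetical, and the `sysfs_root` parameter exists only so the function can be pointed at a test directory.

```python
from pathlib import Path


def queue_attrs(device, sysfs_root="/sys/block"):
    """Return selected queue attributes of a block device as integers.

    Attributes that are missing or unreadable are reported as None.
    """
    queue = Path(sysfs_root) / device / "queue"
    attrs = {}
    for name in ("rotational", "discard_granularity", "discard_max_bytes"):
        try:
            attrs[name] = int((queue / name).read_text().strip())
        except (OSError, ValueError):
            attrs[name] = None
    return attrs


def looks_trim_capable(attrs):
    """True if the device is non-rotational and reports a non-zero discard limit."""
    return attrs["rotational"] == 0 and bool(attrs["discard_max_bytes"])


if __name__ == "__main__":
    info = queue_attrs("sda")  # device name taken from the question
    print(info, "TRIM-capable:", looks_trim_capable(info))
```

A zero or missing `discard_max_bytes` would suggest that the block layer does not consider the device discard-capable, whatever `hdparm` reports.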
Swap fill-up: enable the swap partition, drop all VM caches and configure Linux to swap aggressively by setting vm.swappiness to 100. Then run a script that allocates all the available memory and forces the kernel to start swapping:
    # swapon [--discard] /dev/sda2
    # echo 3 > /proc/sys/vm/drop_caches
    # echo 100 > /proc/sys/vm/swappiness
    # ./fill-up-memory.up
The script runs on a server with 32GB of physical memory plus the 2GB swap partition and creates a ~33.8GB object in memory, which is enough to fill up all the memory and start swapping. This is an example of a script that achieves this behavior:
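    #!/usr/bin/python
    # Python 2 script: build one huge string of 'A' characters, then wait.
    mem = 33.8  # object size in GiB
    testing = 'A' * int(1024 * 1024 * 1024 * mem)
    raw_input()  # keep the memory allocated until Enter is pressed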
Check swap content: “swapon -s” shows that 100% of the swap space is used. Using “hdparm --read-sector” I check the raw content of the swap partition sectors, and all bytes read as “4141”, i.e. the hexadecimal code (0x41) of the “A” character repeated, so everything works as expected. This is an example script that reads the content of the swap partition sector by sector:
    #!/bin/bash
    for sector in $(seq 194560 4100095); do
        hdparm --read-sector "$sector" /dev/sda
    done
NOTE: you can get the start/end sector of the swap partition using parted, cfdisk, etc.
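Instead of reading the hdparm output by eye, the sectors that still hold the test pattern can be counted. This is a minimal sketch, assuming 512-byte logical sectors and a pattern whose length divides the sector size. The function name `count_pattern_sectors` is hypothetical. Unlike “hdparm --read-sector”, it reads through the normal block layer, so cached data may be returned unless the caches are dropped first.

```python
SECTOR_SIZE = 512  # assumed logical sector size in bytes


def count_pattern_sectors(path, first_sector, last_sector, pattern=b"A"):
    """Count sectors in [first_sector, last_sector] that consist only of `pattern`."""
    expected = pattern * (SECTOR_SIZE // len(pattern))
    matches = 0
    with open(path, "rb", buffering=0) as dev:
        dev.seek(first_sector * SECTOR_SIZE)
        for _ in range(first_sector, last_sector + 1):
            data = dev.read(SECTOR_SIZE)
            if len(data) < SECTOR_SIZE:
                break  # reached the end of the device or file
            if data == expected:
                matches += 1
    return matches
```

With the sector range from the script above, a call such as `count_pattern_sectors("/dev/sda", 194560, 4100095)` (run as root) would return the number of sectors still filled with “A”. A count that stays unchanged after the swap is released would indicate that nothing was discarded or overwritten.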
When I stop the script, it releases all the memory including the swap allocations, and “swapon -s” reports no swap usage on the system. At this point I expect Linux to start discarding the content of the swap partition in the background, but this does not happen: the content of the sectors is still “4141”, even several hours later.
I have run several tests, and it seems that Linux only performs a full discard when the partition is enabled through the swapon() system call, but never in the background, even though the “discard” mount option is enabled in /etc/fstab.
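To confirm programmatically that the swap is empty again before inspecting the sectors, the table shown by “swapon -s” can be read from /proc/swaps. This is a minimal sketch, assuming the usual five-column layout (Filename, Type, Size, Used, Priority) with sizes in KiB. The names `parse_swaps` and `swap_in_use` are hypothetical.

```python
def parse_swaps(text):
    """Parse the contents of /proc/swaps into a list of dicts (sizes in KiB)."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore blank or malformed lines
        entries.append({
            "filename": fields[0],
            "type": fields[1],
            "size_kib": int(fields[2]),
            "used_kib": int(fields[3]),
            "priority": int(fields[4]),
        })
    return entries


def swap_in_use(entries):
    """Total swap currently in use, in KiB, across all swap areas."""
    return sum(entry["used_kib"] for entry in entries)


if __name__ == "__main__":
    with open("/proc/swaps") as fh:
        print(swap_in_use(parse_swaps(fh.read())), "KiB of swap in use")
```

A result of 0 corresponds to the “no swap usage” state described above, which is the point at which the sectors are checked again.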
Further research: blkdev_issue_discard() is the kernel function in charge of sending TRIM commands to the underlying SSD devices. There are only two references to this function in mm/swapfile.c:
- discard_swap(): called during the swapon() process; if the “discard” mount option is enabled, it discards all the content. This works as expected.
- discard_swap_cluster(): it should discard the content of a swap cluster, but it seems that it never issues a TRIM command.
Question: what is the expected behavior of Linux with swap on SSD devices? Should it discard all free sectors/pages, or does it only issue an initial full discard when the partition is enabled during the boot process? Thanks.
It seems that discard_swap_cluster is only called from scan_swap_map which in turn is called from get_swap_page or get_swap_page_of_type. So if I'm correct, the discarding only happens when a new swap page is going to be allocated, not when a page is freed.
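To illustrate the consequence of that call chain, here is a toy model in Python. It is not the kernel's code or data structures: the class `ToySwapArea` and its methods are hypothetical and only mimic the behavior described above, where freeing a page leaves the stale data in place and a discard is issued only when a completely free cluster is picked for a new allocation.

```python
class ToySwapArea:
    """Toy model only: discards happen on allocation, never on free."""

    def __init__(self, clusters, slots_per_cluster):
        self.slots_per_cluster = slots_per_cluster
        self.in_use = [[False] * slots_per_cluster for _ in range(clusters)]
        self.on_disk = [[None] * slots_per_cluster for _ in range(clusters)]
        self.discards = 0

    def free(self, cluster, slot):
        # Freeing only updates the bookkeeping; stale data stays on the device.
        self.in_use[cluster][slot] = False

    def allocate(self, data):
        # Prefer a completely free cluster and discard stale data before reuse.
        for c, used in enumerate(self.in_use):
            if not any(used):
                if any(d is not None for d in self.on_disk[c]):
                    self.on_disk[c] = [None] * self.slots_per_cluster
                    self.discards += 1
                self.in_use[c][0] = True
                self.on_disk[c][0] = data
                return (c, 0)
        # Otherwise fall back to any free slot, without discarding anything.
        for c, used in enumerate(self.in_use):
            for s, slot_used in enumerate(used):
                if not slot_used:
                    self.in_use[c][s] = True
                    self.on_disk[c][s] = data
                    return (c, s)
        raise MemoryError("toy swap area is full")
```

In this model, filling the area and then freeing every slot leaves `on_disk` untouched and `discards` at 0, which matches the “4141” sectors observed after the test script exits. Only the next allocation triggers a discard.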
It could be that your system has --discard=once as default. Have you tried mounting with a specific discard option and forcing it like this:
# swapon --discard=pages /dev/sda2
You could also try to make a fstrim service, or configure it if it's already available.

The contents of swap are effectively "discarded" when swapon -s returns "no swap used". The system is not going to overwrite the contents of the blocks (filled with "4141") because it's an SSD and excessive writes would shorten the life of the SSD. (At least, that's what I take away from the documentation.)