I recently changed the checksum property on one of my non-deduplicated zfs filesystems from on (fletcher4) to sha256, to better support the sending of deduplicated replication streams, as in this command:
zfs send -DR -I _starting-snapshot_ _ending-snapshot_
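The property change itself was just the usual zfs set; a sketch of what I ran (the dataset name tank/data is a placeholder for my actual filesystem):

```shell
# Switch the checksum algorithm used for future writes.
# tank/data is a hypothetical dataset name.
zfs set checksum=sha256 tank/data

# Confirm the new setting took effect:
zfs get checksum tank/data
```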
However, the zfs manpage has this to say about send -D:
This flag can be used regardless of the dataset’s dedup property, but performance will be much better if the filesystem uses a dedup-capable checksum (eg. sha256).
The zfs manpage also states this about the checksum property:
Changing this property affects only newly-written data.
I have no desire to trust fletcher4 for dedup: unlike SHA256, fletcher4 is not a pseudo-random hash function and therefore cannot be trusted not to collide. It is only suitable for dedup when combined with the verify option, which detects and resolves hash collisions.
How can I update the filesystem's checksums, preferably without offlining the system?
To change the properties (be it compression, deduplication or checksumming) of already written data, the zfs approach is to run the data through a
zfs send | zfs receive
sequence. Obviously, you do not need to offline the system for that, but you will need enough free space in the pool to hold a second copy of the data set.

As you are already using deduplication for the zpool, running a
zfs send | zfs receive
with the destination on the same pool as the source would only use the space needed for the newly-written metadata blocks. But be prepared for the copy to take a while - dedup can be awfully slow, especially if you do not have enough RAM to hold the entire dedup table.

You obviously would need to cease all write operations to create the final, authoritative copy of the data set, but could minimize the downtime by copying off a snapshot first, stopping all writes and doing an incremental
zfs send -i | zfs receive
as the final step.
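Put together, the procedure might look like this sketch (the pool/dataset names tank/data and tank/data-new are hypothetical; adapt them to your layout):

```shell
# 1. Copy off a snapshot while the system stays online.
#    All data arriving at the destination is written with the
#    currently set checksum property (sha256 in your case).
zfs snapshot tank/data@migrate1
zfs send tank/data@migrate1 | zfs receive tank/data-new

# 2. Stop all writes to tank/data, take a final snapshot, and send
#    only the changes accumulated since the first snapshot:
zfs snapshot tank/data@migrate2
zfs send -i tank/data@migrate1 tank/data@migrate2 | zfs receive tank/data-new
```

The incremental send in step 2 should be quick relative to the full copy, which is what keeps the write-outage window small.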