First: I'm perfectly okay with accepting that this is the case for now and I'm not looking for an immediate solution; rather, I'm trying to understand the technical reason behind this constraint.
I'm working primarily with ZFS on Linux, but my understanding is that all FOSS ZFS development is rooted in OpenZFS by now, so information on any/all of its variants is appreciated.
The man page of zpool remove states:
Top-level vdevs can only be removed if the primary pool storage does not contain a top-level raidz vdev, all top-level vdevs have the same sector size, and the keys for all encrypted datasets are loaded.
I understand and/or can guess the reasons for most of these restrictions, but I don't really understand why the mere presence of a raidz vdev prevents removal of any vdev, even a mirrored or non-redundant one.
It was my understanding/assumption that from the pool perspective each vdev acts as a "dumb block device" with the actual redundancy/mirroring happening on the vdev level (as suggested by the repeated warning that there is no redundancy at the pool level: all redundancy must exist at the vdev level and a single vdev going bad takes the whole pool down).
Under that assumption it shouldn't matter what specific data vdev is removed, let alone the presence of a "bad" (raidz) vdev in the pool.
Clearly that assumption (or some other one that I can't think of) is wrong. Can someone enlighten me on what?
The only guess I have left that I haven't been able to verify is that there is no absolute reason why raidz vdevs would prevent vdev removal, but that some interaction of some raidz-specific operation and device removal is simply not implemented/tested/verified at this point.
Data inside a RAIDZ vdev is striped differently than on a single-disk or mirror vdev. Removing a single-disk (or mirror) vdev really means creating a hidden indirect vdev that contains a table remapping (redirecting) each old DVA to a new one, but this requires the metadata layout to be the same between the removed device and the new one. This is simply not the case when the data is copied to a RAIDZ vdev.
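To make the remapping idea more concrete, here is a toy model in Python of what such an indirect table does conceptually. This is only a sketch: the names `MappingEntry`, `IndirectVdev` and `remap` are made up for illustration and are not the actual OpenZFS structures or on-disk format. It assumes each entry maps a contiguous range of the removed vdev to an equally sized contiguous range on another vdev, which is the kind of one-to-one correspondence that a RAIDZ destination (with its parity and striping) would not give you.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingEntry:
    """Hypothetical entry: a range of the removed vdev and where it now lives."""
    src_offset: int  # start offset on the removed vdev
    size: int        # length of the contiguous range in bytes
    dst_vdev: int    # id of the vdev that received the data
    dst_offset: int  # start offset on the destination vdev


class IndirectVdev:
    """Toy stand-in for the hidden indirect vdev left behind after removal."""

    def __init__(self, entries):
        # Keep entries sorted by source offset so lookups can use bisection.
        self.entries = sorted(entries, key=lambda e: e.src_offset)
        self.starts = [e.src_offset for e in self.entries]

    def remap(self, offset):
        """Translate an offset on the removed vdev to (vdev id, offset)."""
        i = bisect_right(self.starts, offset) - 1
        if i < 0:
            raise LookupError(f"offset {offset} is not mapped")
        entry = self.entries[i]
        delta = offset - entry.src_offset
        if delta >= entry.size:
            raise LookupError(f"offset {offset} is not mapped")
        # Same distance into the range on both sides: this only works
        # because source and destination ranges have an identical layout.
        return (entry.dst_vdev, entry.dst_offset + delta)


# Example table (made-up numbers): two ranges moved to vdevs 1 and 2.
table = IndirectVdev([
    MappingEntry(src_offset=0, size=4096, dst_vdev=1, dst_offset=81920),
    MappingEntry(src_offset=8192, size=8192, dst_vdev=2, dst_offset=0),
])
print(table.remap(100))
print(table.remap(8192 + 512))
```

Every old block pointer that still references the removed vdev goes through a lookup like `remap` above, so nothing has to be rewritten at removal time. The price is that a plain "old offset plus delta" translation must be valid on the destination.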