I have accidentally added a mismatched raidz1 to an existing pool. Yes, I specified the '-f' to do it, but I was expecting to use the '-f' for a different reason and wasn't paying attention.
Anyhoo... how bad is it? I really just needed extra space in the pool and wanted that space to be redundant. The pool looks like this:
NAME                       STATE     READ WRITE CKSUM
pool_02c                   ONLINE       0     0     0
  raidz1-0                 ONLINE       0     0     0
    c0t5000C500B4AA5681d0  ONLINE       0     0     0
    c0t5000C500B4AA6A51d0  ONLINE       0     0     0
    c0t5000C500B4AABF20d0  ONLINE       0     0     0
    c0t5000C500B4AAA933d0  ONLINE       0     0     0
  raidz1-1                 ONLINE       0     0     0
    c0t5000C500B0889E5Bd0  ONLINE       0     0     0
    c0t5000C500B0BCFB13d0  ONLINE       0     0     0
    c0t5000C500B09F0C54d0  ONLINE       0     0     0
I read one other question about this kind of scenario and it stated that "both performance and space efficiency (ie: via un-optimal padding) will be affected", but that's a little vague and I'm hoping someone can give a bit more detail.
From a pool use standpoint, isn't data put on the disks in vdev raidz1-0 redundant in that vdev and data put into raidz1-1 redundant within that vdev? And, if that's the case, wouldn't performance be related to the specific vdev?
Where does padding come into play here, and how would that affect storage capacity? i.e., would it cause more space to be allocated, so that for every 1M I write, I use up 1.2M?
I'm not overly concerned about performance for this pool, but how does this configuration affect read/write speeds? I would expect each vdev to perform at the speed of its respective devices, so how does the replication difference between the vdevs affect this?
As an FYI, this is on a Solaris 11.4 system. I have tried to remove the vdev using:
zpool remove pool_02c raidz1-1
but I get the error:
cannot remove device(s): not enough space to migrate data
Which seems odd since I literally just added it and haven't written anything to the pool.
I'm ok living with it since it seems to have given me the space I expected, but just want to better understand the devil I'll be living with.
Short answer: while slightly sub-optimal, your pool layout is not lacking in a major way - it is a legitimate configuration.
Long answer: RAIDZ vdevs are atypical in how they store data compared to a traditional RAID array. In a RAIDZ vdev some space can be lost due to a) padding and b) dynamic stripe width. I suggest reading (multiple times, if needed) this very useful article. Anyway, the two takeaways are:
- allocation happens at ashift * (redundancy+1) granularity (ie: for an ashift=12 RAIDZ1 vdev, allocation granularity is 4K*(1+1) = 8K; allocating 1, 3 or 5 blocks is not possible, so 2, 4 or 6 will be allocated instead). This has an interesting interaction with recordsize: as an example, for the extreme case of all-4K block writes, RAIDZ1 has the same space efficiency as mirror vdevs (ie: 1 data + 1 parity block)
- stripe size is dynamic, meaning that for a single recordsize data block there will be at least one, and possibly many (depending on vdev width), parity blocks. This means that the intersection between maximum dynamic stripe size (recordsize / ashift), file size and vdev width will cause a varying number of parity blocks to be written

Due to point #2 above, unequal vdev width can lead to different space efficiency for the two vdevs (see the worked example below).
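To make that concrete, here is a rough back-of-the-envelope sketch for your two vdevs. It is an assumption-laden illustration, not an official Solaris formula: it assumes ashift=12 (4K sectors), a 128K recordsize, and the RAIDZ allocation rule as commonly described (one parity sector per row of data sectors, with the total rounded up to a multiple of parity+1):

recordsize=131072; sector=4096; parity=1
data=$(( recordsize / sector ))                               # 32 data sectors per 128K block
for width in 4 3; do                                          # raidz1-0 is 4 disks wide, raidz1-1 is 3
  rows=$(( (data + width - parity - 1) / (width - parity) ))  # ceil(data sectors / data disks per row)
  total=$(( data + rows * parity ))                           # data sectors + parity sectors
  alloc=$(( (total + parity) / (parity + 1) * (parity + 1) )) # round up to a multiple of parity+1
  echo "width $width: $data data sectors -> $alloc allocated ($(( alloc * sector / 1024 ))K on disk)"
done

With those assumptions, a 128K block costs 44 sectors (176K) on the 4-wide vdev, about 73% space efficiency versus its nominal 75% (one sector of padding), and 48 sectors (192K) on the 3-wide vdev, exactly its nominal 67%. So at the default recordsize the extra padding loss is small; it grows as the block size shrinks.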
However, you should avoid putting too much weight on these calculations: enabling compression changes the effective physical allocation size in a manner that almost always outweighs the space-efficiency effects described by the rules above.
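For example, something as simple as this (a minimal illustration; it assumes you want the property inherited pool-wide, and note that only data written after the change gets compressed) will usually matter more than the per-vdev padding math:

zfs set compression=on pool_02c
zfs get compression,compressratio pool_02c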
So if you had 3 (and only 3) disks to add, creating a second RAIDZ1 vdev was the right thing to do. That said, be careful when using '-f'.
About vdev removal, I cannot comment, as I don't know how Solaris manages that.