I'm finding information on this quite difficult to come by, but it seems like an important feature of a RAID system: what happens immediately after a disk fails under BTRFS?
On my hardware RAID systems, I'm alerted that a disk has failed, but the system keeps running and I can hot-swap the disk without any downtime. When I tried RAID1 with BTRFS and a disk failed, the entire system crashed, and I couldn't mount the filesystem in degraded mode without risking it becoming permanently read-only. I believe that limitation is gone since Linux 4.14, but it was still anything but a slick experience.
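To be concrete, the recovery path I was looking at was roughly the following (the device paths and the devid are placeholders, not my actual layout):

```
# Mount the surviving device read-write in degraded mode
# (pre-4.14 kernels would only allow this once per filesystem)
mount -o degraded /dev/sdb1 /mnt

# Replace the missing device (devid 2 here) with a new disk
btrfs replace start -B 2 /dev/sdd1 /mnt
```

That works, but it involves downtime and manual intervention, which is exactly what I'm hoping to avoid.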
The official BTRFS wiki does not explain what happens immediately after a complete disk failure, but there is a sentence saying you will need to mount in a degraded state. Is that not automatic?
There are SO questions about BTRFS and disk failure, but they all seem to imply you need to remount the filesystem after the failure event (i.e. things don't just keep running). Is it possible to keep things going while making the failure invisible to the services using the disk, like with the hardware RAIDs I've used in the past?
Since I'm asking this question, I may as well also ask: if BTRFS can't do this, are there other options, like mdadm or ZFS, that do offer always-up service during a disk failure?
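For comparison, this is the kind of behaviour I'm used to from mdadm: when a member disk dies, the array just drops to degraded and stays online, and I can watch the state and any rebuild with something like this (assuming the array is /dev/md0):

```
# Overall array state and rebuild progress
cat /proc/mdstat

# Per-member detail for a specific array
mdadm --detail /dev/md0
```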