I'm going to purchase a 14-bay HP 2U server with ECC memory and a bunch of 3TB enterprise SATA drives. I would prefer to build 2 x RAID6 arrays with 6 drives each. This will be used for local office storage and my Internet crawler.
I am flooded with filesystem/RAID choices and wonder whether I should use a hardware RAID card, LVM, mdadm, btrfs, or ZFS via FUSE. I will likely use Debian/Ubuntu as the OS.
My priorities, in order of importance:
- data security, no loss, no corruption (most important)
- reasonable performance
- minimal maintenance and easier recovery/replacement if anything goes wrong
Does anyone have production experience with these?
EDIT: added fuse-zfs
For data security, ZFS offers dataset encryption (where your ZFS implementation supports it), and for protection against corruption ZFS also provides the best data reliability, since it checksums every block. ZFS has its own software RAID, so you can build mirrors or parity arrays of several types, depending on your needs. You can also add hot spares to a pool, so that if one drive fails it is automatically replaced with a spare. Rebuilds are fastest with ZFS as well, because it does not recalculate every block on the drive; it reconstructs only the data that actually needs to be recovered. ZFS parity RAID (RAIDZ, RAIDZ2 and RAIDZ3, roughly comparable to RAID5/6) is also much more reliable, because parity does not need to be calculated again and again (as is done in traditional RAID5/6...
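To put rough numbers on the parity levels mentioned above, here is a small Python sketch of usable capacity for the layout in the question. The function name and the simple "drives minus parity" model are my own assumptions; real pools lose somewhat more to metadata, padding, and TB-versus-TiB rounding.

```python
def usable_tb(drives_per_group, parity_drives, drive_tb, groups=1):
    """Approximate usable capacity in TB.

    Simplified model (an assumption): each group gives up exactly
    `parity_drives` drives' worth of space to parity.
    """
    if parity_drives >= drives_per_group:
        raise ValueError("a group needs more drives than parity devices")
    return groups * (drives_per_group - parity_drives) * drive_tb


# Layout from the question: 2 groups x 6 drives x 3 TB, double parity (RAID6 / RAIDZ2).
print(usable_tb(6, 2, 3, groups=2))  # 24

# Alternative for comparison: one 12-drive group with triple parity (RAIDZ3).
print(usable_tb(12, 3, 3))  # 27
```

Either way, two of the 14 bays stay free for hot spares or a system disk.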
For extreme performance, add a 100-400 GB SSD (sized to your needs) as a ZFS cache device: L2ARC caches reads, and a separate log device speeds up synchronous writes. If tuned properly, you can get more than 10,000 IOPS, compared with a plain 14-drive RAID, which can give you about 14x the read speed of a single drive (approx. 1500-2000 IOPS)...
Don't go for btrfs, as it is the slowest filesystem.
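The 1500-2000 IOPS figure for the spinning-disk array can be reproduced with back-of-the-envelope arithmetic. This sketch assumes roughly 110-140 random read IOPS per 7200 rpm SATA drive (a common rule of thumb, not a measured value) and ideal linear scaling across drives, which real arrays rarely reach.

```python
def array_read_iops(drives, iops_per_drive):
    """Ideal-case random read IOPS: every drive serves requests in parallel."""
    return drives * iops_per_drive


# Assumed per-drive range for 7200 rpm SATA disks (rule of thumb, not measured).
low = array_read_iops(14, 110)
high = array_read_iops(14, 140)
print(low, high)  # 1540 1960
```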
Start with your backup plan. All the RAID-like technologies you mention will reduce downtime when you lose a disk, but they can't replace backups.
Short-lived snapshots (LVM, Btrfs, ZFS) give you a consistent state to back up from. zfs send (zfs->zfs), btrfs send (btrfs->btrfs), and dump (ext->any) are useful to send changes incrementally; rsync is a more portable but less efficient alternative.
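As a rough illustration of the snapshot-then-send cycle for the ZFS case, the Python sketch below only builds the command lines; it does not run them. The dataset and snapshot names (`tank/crawl`, `backup/crawl`, `monday`, `tuesday`) are hypothetical, and it assumes the previous snapshot already exists on both sides.

```python
import shlex


def incremental_send_commands(dataset, prev_snap, new_snap, target):
    """Return shell command lines for one incremental ZFS backup cycle.

    Step 1 takes a new snapshot; step 2 sends only the changes between
    the previous and the new snapshot to the target dataset.
    """
    src_prev = shlex.quote(f"{dataset}@{prev_snap}")
    src_new = shlex.quote(f"{dataset}@{new_snap}")
    return [
        f"zfs snapshot {src_new}",
        f"zfs send -i {src_prev} {src_new} | zfs receive {shlex.quote(target)}",
    ]


# Hypothetical names, for illustration only.
for line in incremental_send_commands("tank/crawl", "monday", "tuesday", "backup/crawl"):
    print(line)
```

To back up to another machine, the receiving half of the pipeline would typically run over ssh on the backup host.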