I'm building multiple storage nodes for a Riak database. Each node contains 20-40 x 2 TB drives. Riak replicates each write to 3 nodes, so I already have redundancy at that level. What's the best and most efficient way to create one "virtual hard drive" per node without risking data loss if a drive fails (RAID 0) or doing unnecessary replication (RAID 1+)? I'm using Ubuntu Server.
Originally I was thinking about using ZFS, but I'm open to suggestions.
I would struggle to suggest any file system other than ZFS for any purpose, unless you're using it under Linux, in which case I'd timidly suggest you use LVM.
Well, if you have redundancy across three nodes, surely you should be able to use RAID 0? A single disk failure takes the whole array down with it, but you can rebuild the node once you have a new disk in place and copy the data back from another node. If you need redundancy at the node level, I guess your only options would be RAID 10, or possibly RAID 5 so long as your database isn't too write-intensive.
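To put rough numbers on that trade-off, here is a minimal Python sketch of the usable capacity per node for the three levels mentioned. It assumes a single array of identical 2 TB drives and ignores file system overhead and hot spares; the function name `usable_tb` is made up for illustration, and in practice you would probably not build one RAID 5 set out of 20+ drives.

```python
def usable_tb(level: str, drives: int, size_tb: float = 2.0) -> float:
    """Usable capacity in TB for one array of identical drives (illustrative only)."""
    if level == "raid0":
        # Pure striping: every drive holds data, no redundancy at all.
        return drives * size_tb
    if level == "raid5":
        # One drive's worth of capacity goes to parity.
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_tb
    if level == "raid10":
        # Striped mirrors: half the drives hold copies.
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drives / 2 * size_tb
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    for n in (20, 40):
        for level in ("raid0", "raid5", "raid10"):
            print(f"{n} drives, {level}: {usable_tb(level, n):.0f} TB usable")
```

For a 20-drive node that works out to 40 TB with RAID 0, 38 TB with RAID 5 and 20 TB with RAID 10. Since Riak already keeps 3 copies, RAID 10 on every node would mean each object effectively sits on 6 disks.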