We're considering building a ~16TB storage server. At the moment, we're considering both ZFS and XFS as filesystem. What are the advantages, disadvantages? What do we have to look for? Is there a third, better option?
ZFS will give you advantages beyond software RAID. The command structure is very thoughtfully laid out, and intuitive. It's also got compression, snapshots, cloning, filesystem send/receive, and cache devices (those fancy new SSD drives) to speed up indexing meta-data.
Compression:
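For example (the pool/dataset name tank/data here is hypothetical):

```
# Turn on transparent compression for a dataset; new writes are compressed
zfs set compression=on tank/data
# Check how well it is working
zfs get compressratio tank/data
```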
It supports simple-to-create copy-on-write snapshots that can be live-mounted:
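For example (dataset and snapshot names assumed):

```
# Take an instant, space-efficient snapshot
zfs snapshot tank/data@monday
# Snapshots are browsable read-only under the hidden .zfs directory
ls /tank/data/.zfs/snapshot/monday
```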
Filesystem cloning:
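A clone is a writable filesystem branched off a snapshot, sharing unchanged blocks with it. A sketch with the same assumed names:

```
# Create a writable clone of the snapshot taken above
zfs clone tank/data@monday tank/data-devel
```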
Filesystem send/receive:
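Roughly like this (host and pool names are made up):

```
# Serialize a snapshot to a stream and recreate it on another pool/host
zfs send tank/data@monday | ssh backuphost zfs receive backup/data
```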
Incremental send/receive:
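Same idea, but only the delta between two snapshots crosses the wire (names assumed):

```
# Send only the blocks changed between @monday and @tuesday
zfs send -i tank/data@monday tank/data@tuesday | ssh backuphost zfs receive backup/data
```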
Caching devices:
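Adding an SSD as a read cache (L2ARC) is one command; the device name below is a placeholder:

```
# Attach an SSD to the pool as a cache device
zpool add tank cache c2t0d0
```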
This is all just the tip of the iceberg; I would highly recommend getting your hands on an install of OpenSolaris and trying this out.
http://www.opensolaris.org/os/TryOpenSolaris/
Edit: This answer is very old. OpenSolaris has been discontinued; the best way to use ZFS these days is probably on Linux or FreeBSD.
Full disclosure: I used to be a Sun storage architect, but I haven't worked for them in over a year; I'm just excited about this product.
I've found XFS better suited to extremely large filesystems with possibly many large files. I've had a functioning 3.6TB XFS filesystem for over 2 years now with no problems. Definitely works better than ext3, etc. at that size (especially when dealing with many large files and lots of I/O).
What you get with ZFS is device pooling, striping and other advanced features built into the filesystem itself. I can't speak to specifics (I'll let others comment), but from what I can tell, you'd want to use Solaris to get the most benefit here. It's also unclear to me how much ZFS helps if you're already using hardware RAID (as I am).
Using LVM snapshots and XFS on live filesystems is a recipe for disaster, especially with very large filesystems.
I've been running exclusively on LVM2 and XFS for the last 6 years on my servers (even at home, since zfs-fuse is just plain too slow)...
However, I can no longer count the different failure modes I encountered when using snapshots. I've stopped using them altogether - it's just too dangerous.
The only exception I'll make now is my own personal mailserver/webserver backup, where I do overnight backups using an ephemeral snapshot that is always equal in size to the source fs and gets deleted right afterwards.
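A minimal sketch of that routine, with hypothetical volume group, LV names, and sizes:

```
# Snapshot sized like the source LV, so it can never overflow mid-backup
lvcreate --snapshot -L 20G -n mail-snap /dev/vg0/mail
# XFS needs nouuid to mount a snapshot alongside its origin
mount -o ro,nouuid /dev/vg0/mail-snap /mnt/snap
rsync -a /mnt/snap/ /backup/mail/
umount /mnt/snap
# Drop the snapshot immediately afterwards
lvremove -f /dev/vg0/mail-snap
```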
Most important aspects to keep in mind:
A couple of additional things to think about.
If a drive dies in a hardware RAID array, all the blocks on the device have to be rebuilt, regardless of the filesystem that's on top of it - even the ones that didn't hold any data. ZFS, on the other hand, is the volume manager and the filesystem, and manages data redundancy and striping itself, so it can intelligently rebuild only the blocks that contained data. This results in faster rebuild times, except when the volume is 100% full.
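For example, replacing a failed disk kicks off a resilver that only touches allocated blocks (pool and device names assumed):

```
# Swap the dead disk for a new one; ZFS resilvers live data only
zpool replace tank c1t2d0 c1t5d0
# Watch the resilver progress
zpool status tank
```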
ZFS has background scrubbing which makes sure that your data stays consistent on disk and repairs any issues it finds before it results in data loss.
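A scrub is run on demand (or from cron); the pool name is assumed:

```
# Verify every checksum in the pool and repair from redundancy where possible
zpool scrub tank
# Report progress and any checksum errors found/repaired
zpool status -v tank
```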
ZFS file systems are always in a consistent state so there is no need for fsck.
ZFS also offers more flexibility and features with its snapshots and clones compared to the snapshots offered by LVM.
Having run large storage pools for large-format video production on a Linux/LVM/XFS stack, my experience has been that it's easy to fall into micro-managing your storage. This can result in large amounts of unused allocated space and time/issues with managing your logical volumes. This may not be a big deal if you have a full-time storage administrator whose job is to micro-manage the storage, but I've found that ZFS's pool storage approach removes these management issues.
ZFS is absolutely amazing. I am using it at home on a 5 x 1 TB file server, and am also using it in production with almost 32 TB of hard drive space. It is fast, easy to use, and contains some of the best protection against data corruption.
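For what it's worth, a pool like my home setup is a one-liner (the disk names are placeholders):

```
# Single-parity raidz across five 1 TB disks; mounted at /tank right away
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0
```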
We are using OpenSolaris on this server in particular because we wanted to have access to newer features and because it provided the new package management system and way of upgrading.
I don't think you should focus on performance. Is your data safe with XFS, ext4, etc.? No. Read this PhD thesis and these research papers:
XFS is not safe against data corruption: pages.cs.wisc.edu/~vshree/xfs.pdf
And neither is ext3, JFS, ReiserFS, etc: zdnet.com/blog/storage/how-microsoft-puts-your-data-at-risk/169?p=169&tag=mantle_skin%3bcontent "I came across the fascinating PhD thesis of Vijayan Prabhakaran, IRON File Systems, which analyzes how five commodity journaling file systems - NTFS, ext3, ReiserFS, JFS and XFS - handle storage problems.
In a nutshell, he found that all the file systems have [...]"
But ZFS successfully protects your data. Here is a research paper on this: zdnet.com/blog/storage/zfs-data-integrity-tested/811
Which OS are you planning on running? Or is that another part of the consideration? If you're running Solaris, XFS isn't even an option as far as I know. If you're not running Solaris, how are you planning on using ZFS? Support is limited on other platforms.
If you're talking about a Linux server, I'd stick with ext3 personally, if only because it receives the most testing. zfs-fuse is still very young. Also, I had trouble with XFS once, when a bug caused data corruption after a kernel update. The advantages of XFS over ext3 definitely didn't outweigh the costs involved in restoring the machine, which was located in a remote datacenter.
Not a FS-oriented answer, sorry, but be aware that a number of disk controllers won't deal with >2TB LUNs/logical disks - this can limit the way you organise your storage quite a bit. I just wanted you to be aware so you can check your system end-to-end to ensure it'll deal with 16TB throughout.
It depends on what features you want... The two reasonable choices are XFS and ZFS, as you have said. The XFS code is pretty well tested; I first used it 8 years ago under IRIX.
It is possible to get snapshots from XFS (using LVM and xfs_freeze) - see the sketch after this list.
It is possible to have a separate log device, e.g. an SSD.
Large XFS filesystems traditionally need lots of memory to check (with xfs_repair).
The issue with zeros turning up in files after a crash was a "security" feature, which I think disappeared a while ago.
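A rough sketch of the xfs_freeze + LVM snapshot dance mentioned above (the mount point and LV names are hypothetical):

```
# Quiesce the XFS filesystem so the snapshot is consistent
xfs_freeze -f /srv/data
# Take the LVM snapshot while I/O is suspended
lvcreate --snapshot -L 10G -n data-snap /dev/vg0/data
# Resume normal I/O
xfs_freeze -u /srv/data
```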
Well guys, let's not forget about the latest addition to ZFS: deduplication. And let's talk about on-the-fly iSCSI, NFS, or SMB sharing. As others have already said: exports of ZFS file systems, snapshots, raidz (=RAID5), block checksums, dynamic stripe width, cache management, and many others. I vote for ZFS.
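A taste of dedup and the built-in sharing, with an assumed pool name (note that dedup needs plenty of RAM for its dedup table):

```
# Block-level deduplication for a dataset
zfs set dedup=on tank/data
# Publish over NFS and SMB straight from ZFS, no exports/smb.conf editing
zfs set sharenfs=on tank/data
zfs set sharesmb=on tank/data
```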