Summary of My Need
We put a large number of files on a filesystem for analysis at a later time. We can't control how many files we're going to have, and this one box needs access to all of them.
Unchangeable Limitations
- I can't change the inode limit. It's ext4, and it's at the default of roughly 4 billion (2^32) inodes.
- There will always be a lot of files. The question isn't how to reduce the number of files; it's how to circumvent the 4Bn inode limit.
- I can't use network storage. This box lives in a data center and due to the staggering amount of existing data throughput, network storage is not an option.
My Ideas
- I could mount a file as a loopback device in the location where we're placing these files (see the sketch after this list).
- Pro: Simple to implement
- Con: Another layer of complexity, but a pretty thin one.
- XFS. No inode limit.
- Pro: This obviously just erases the problem.
- Con: Not sure how much flexibility I'll have in making this change to a production system.
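Roughly what I have in mind for the loopback option (the image path, mount point, and size below are placeholders):

    # Create a sparse image file to hold a nested filesystem
    truncate -s 1T /srv/images/files01.img

    # Put a fresh filesystem, with its own independent inode table, inside it
    mkfs.ext4 -F /srv/images/files01.img

    # Mount it via a loop device where the files are expected to land
    mkdir -p /data/files01
    mount -o loop /srv/images/files01.img /data/files01

Each image would carry its own ~4 billion inode ceiling, so the limit applies per image rather than to the whole box; the cost is managing a growing pile of image files.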
My Question
What are some other strategies for circumventing this hard limitation? Are there other benefits/drawbacks to the approaches I've mentioned?
I would suggest using a network server with a filesystem designed to handle what you need. The first thing that comes to mind is something that runs ZFS (FreeNAS or Nexenta, though the free version of the latter has some limitations), or, if you can afford it, buying something like a NetApp.
I am less familiar with UFS, available on FreeBSD and elsewhere, but I have heard that it would work too.
Info we're missing...
ext4 doesn't sound like it does what you want. So don't use it...
XFS should handle your situation well. It's in the Linux kernel, but how you deploy it depends heavily on how much flexibility you have with your disks. There are also some XFS tunables that will help with your load; out of the box it won't run particularly well, so you'll want the right filesystem-creation and mount options. The rest depends on your distro, whether you're using a RAID controller, and your particular workload.
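As a rough starting point only (the stripe values below assume a hypothetical 8-disk array with a 64 KiB stripe unit, so substitute your controller's real geometry and benchmark with your own workload):

    # Align XFS to the RAID stripe and give it a larger log
    mkfs.xfs -d su=64k,sw=8 -l size=128m /dev/sdb1

    # Mount options that tend to help metadata-heavy, many-file workloads:
    #   noatime  - skip access-time updates on every read
    #   inode64  - allow inodes anywhere on the volume (default on recent kernels)
    #   logbsize - bigger in-memory log buffers for heavy metadata traffic
    mount -o noatime,inode64,logbsize=256k /dev/sdb1 /data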
I guess you answered your own question: the XFS option seems to be the best one (I suspect you'd even get a performance boost). The harder part is how to convert the ext3/4 filesystem to XFS.
If your storage is not a single physical RAID virtual disk (and you didn't make the filesystem directly on the block device, e.g. mkfs.ext4 /dev/sdb), then I'd suggest partitioning your filesystem tree into smaller pieces and mounting them accordingly, configuring your software to write to both locations simultaneously and splitting the writes if possible.
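As a rough sketch of that layout (device names, mount points, and the two-way split are placeholders):

    # Two separate filesystems, each with its own inode table, side by side
    mkfs.xfs /dev/sdc1
    mkfs.xfs /dev/sdd1
    mkdir -p /data/shard0 /data/shard1
    mount /dev/sdc1 /data/shard0
    mount /dev/sdd1 /data/shard1

    # The application (or a thin wrapper) then spreads writes across the
    # shards, e.g. by hashing the filename
    f="somefile.dat"
    shard=$(( $(printf '%s' "$f" | cksum | cut -d' ' -f1) % 2 ))
    cp "$f" "/data/shard${shard}/"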
If splitting the writes is not possible from the application, you could create a cron job that moves the files from the ext4 partition into the new XFS filesystem every n minutes.
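For example, something along these lines (paths are placeholders, and -mmin +5 is only a crude guard against grabbing files that are still being written):

    #!/bin/sh
    # sweep.sh - move settled files from the ext4 landing area to the XFS filesystem
    find /data/landing-ext4 -type f -mmin +5 -exec mv -t /data/archive-xfs/ {} +

with a crontab entry such as:

    # run the sweep every 5 minutes
    */5 * * * * /usr/local/bin/sweep.sh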
Additional options:
Create filesystems on-demand as you need them. Partition with LVM instead of putting your filesystems directly into MBR partitions. You can mount an FS anywhere in your tree, so you can add a new filesystem whenever and wherever you need it. Also, LVM can span portions of multiple disks if you want, meaning physical medium boundaries are less meaningful.
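A bare-bones example of what that looks like (the volume group name, LV name, and sizes are placeholders):

    # Pool two disks into one volume group
    pvcreate /dev/sdb /dev/sdc
    vgcreate datavg /dev/sdb /dev/sdc

    # Carve out a logical volume, format it, and mount it wherever the tree needs it
    lvcreate -L 2T -n files01 datavg
    mkfs.xfs /dev/datavg/files01
    mkdir -p /data/files01
    mount /dev/datavg/files01 /data/files01

Later, lvextend (plus the filesystem's own grow tool) or additional lvcreate calls let you grow or add filesystems without repartitioning.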
Loopback FSes are not a terrible idea, but really why wouldn't you use LVM instead? All the advantages, none of the disadvantages.
If you're just archiving files (i.e. not random-access), then saving them directly into a .tar.gz file is not a bad idea. I've also seen systems where files get "staged" temporarily to an SSD while the structure is being built, then get dumped to a .tar.gz on a spinning drive for long-term storage.
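A naive sketch of that flow (paths are placeholders, and it assumes nothing new lands in the staging directory between the tar and the cleanup):

    # Roll the staged files into a single archive -- one inode -- then clear the stage
    tar -czf /archive/batch-$(date +%Y%m%d%H%M).tar.gz -C /ssd/staging . &&
        find /ssd/staging -mindepth 1 -delete

Individual files can still be listed or pulled back out later with tar -tzf / tar -xzf if someone needs them.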
XFS is not a bad option, though it has its own difficulties as well. It's not quite as forgiving of unclean shutdowns, for example. Not that you would expect data loss, but it does sometimes require more hands-on intervention.
Of all these, automatically pushing files into .tar.gz archives is my favorite. It saves you space and inodes, and is just plain tidy. Large numbers of small files make filesystems much less performant than you'd expect.