I'm forced to use this directory structure: /var/www/$WEBSITE/$DIR1/$DIR2/$FILES.
At the $FILES level, each directory holds approximately 50,000 XHTML pages.
I'm running Cherokee, which has new front-end caching support. But I'm somewhat memory-limited, so I can't cache the whole thing. I believe I can cache just the directory listing, which is the worst part.
What can I do on the filesystem side of things? I normally use ext4 (this server is on ext3), but I know ReiserFS is often preferred for this kind of situation. I could possibly mount just that $WEBSITE on ReiserFS, but I'm really not looking forward to repartitioning and would love to work around it.
Could I create staggered subdirectories somewhere else on the filesystem and just symlink them all into $DIR2? Would that make this nasty situation perform better, with less pain from ext3?
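To make the staggered idea concrete, here is a rough sketch of what I have in mind: shard files into two-level subdirectories keyed on a hash prefix of the filename, then symlink the shard tree into place. All paths and the `place` helper are hypothetical stand-ins, not the real site layout.

```shell
#!/bin/sh
# Sketch: shard files into shards/<aa>/<bb>/ using the first four hex
# characters of the md5 of each filename, so no single directory ever
# holds more than a small fraction of the files.
ROOT=$(mktemp -d)        # throwaway stand-in for a staging area
mkdir -p "$ROOT/shards"

place() {
    name=$1
    hash=$(printf '%s' "$name" | md5sum | cut -c1-4)
    dir="$ROOT/shards/$(printf '%s' "$hash" | cut -c1-2)/$(printf '%s' "$hash" | cut -c3-4)"
    mkdir -p "$dir"
    touch "$dir/$name"   # the real script would move the existing file here
}

place example.xhtml

# The flat path the URLs expect could then just be a symlink to the shard tree.
ln -s "$ROOT/shards" "$ROOT/DIR2"
```

The web server would still need to resolve the hashed path for each request, so this trades directory size for a rewrite rule.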
I really don't want an RDBMS. I would consider a NoSQL option if I could somehow expose it as a faux filesystem; that would be a cool option, I'm just not sure it even exists. Possibly something FUSE-related does?
The whole site already exists, and it is basically just a fancy directory listing. The files get written once and are only read from then on. There is no chance the number of files per directory will increase from this point.
50,000 files should not be enough to cause a significant speed issue on Linux. You mention caching the listing, so I suspect you are doing some kind of processing on the files rather than plainly serving them. I would look for issues in how you process the files.
I recommend XFS, with one possible exception: if you often need to remove lots of files from that directory tree, delete performance is not stellar on XFS. This has been improved somewhat by the new delaylog mount option, though.
Other than that, XFS won't even cough at 50,000 files in a directory.
You can try XFS. I have large directories running on XFS with good results. ls, du, and other file operations are noticeably faster than on ext3. Either way, for scalability, it may make sense to develop a cleaner directory structure.

I found a solution to my problem.
My filesystem performance was making me uncomfortable at a mere ~5,000 files, which is why I posted this question. I would normally use ext4, and I have used XFS, which has always been a solid performer; but I already had everything installed on ext3.
Ext4 enables HTree indexes by default, which would have made this a non-issue. Ext3 also supports HTree indexes via the dir_index feature; however, it was not enabled on my filesystem.
I did have to fsck after rebooting, but otherwise the feature enabled successfully. Once it was on, the performance issues with listing files in those directories were gone. I could avoid implementing a NoSQL-backed VFS (gridfs-fuse), and I could avoid resizing or repartitioning my fully allocated disk.
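For reference, here is a sketch of the enable-and-reindex steps. It runs against a throwaway image file so nothing real is touched; on a live system you would point tune2fs and e2fsck at the (unmounted) block device instead, and the image-file setup lines are purely for demonstration.

```shell
#!/bin/sh
# Sketch: turn on ext3's dir_index (HTree) feature and rebuild existing
# directory indexes. Demonstrated on a scratch image file.
PATH=$PATH:/sbin:/usr/sbin
IMG=$(mktemp)
dd if=/dev/zero of="$IMG" bs=1M count=8 2>/dev/null
mkfs.ext3 -q -F -O ^dir_index "$IMG"   # mimic an old fs created without dir_index

tune2fs -O dir_index "$IMG"            # enable the feature

# Rebuild/optimize directory indexes for directories that already exist.
# e2fsck exits non-zero when it modifies the filesystem, which is expected here.
e2fsck -fD -y "$IMG" >/dev/null 2>&1 || true
```

Only directories created (or reindexed with e2fsck -D) after enabling the feature actually get HTree indexes, which is why the fsck pass matters.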
As for changing my FS, I wanted to avoid that kind of disk operation if at all possible.