I have an application that uses a large amount of space as, essentially, cache data. The more cache available, the better the application performs. We're talking hundreds to thousands of TB. The application can regenerate the data on the fly if blocks go bad, so my primary goal is to maximize the space available on the filesystem for cache data and to aggressively minimize filesystem overhead.
I'm willing to sacrifice all reliability, flexibility, and "general-purpose" requirements. Some specifics:

- I know exactly how many cache files will be on any given volume, because the application writes cache files of a fixed size (~100 GB in my case); a rough sketch of the arithmetic is below.
- I'd like to be able to overwrite a file with a new one if a block occasionally goes bad, so having a few spare inodes lying around would be nice, but reformatting the entire volume is also feasible.
- The files are all stored one directory deep in the filesystem. Directory names can be capped at a single character, and I don't strictly need the directory at all (all files could just as well live at the root of the volume).
- File names are all a fixed size (a hash plus a timestamp).
- Once the cache data is written, the files are only ever read, and the volume can be mounted read-only. The cache stays valid for a long time (years).
- The application validates the integrity of the cache itself, so I don't need filesystem integrity features like checksums, journaling, etc.
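For context, the number of files (and therefore inodes) per volume is tiny and fully predictable. Here's a rough back-of-envelope sketch; the 500 TiB volume size is just a hypothetical example, while the ~100 GB file size is from my setup:

```python
# Back-of-envelope: how many fixed-size cache files (and therefore inodes)
# a single volume needs to hold.

TiB = 1024**4
GiB = 1024**3

volume_size = 500 * TiB       # hypothetical volume size, for illustration
cache_file_size = 100 * GiB   # fixed cache file size from my application

files_per_volume = volume_size // cache_file_size
spare_inodes = 16             # a handful of spares for overwriting bad files

print(f"files per volume:    {files_per_volume}")
print(f"inodes to provision: {files_per_volume + spare_inodes}")
```

So even the largest volumes I'm considering only need a few thousand inodes, far fewer than general-purpose defaults would allocate.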
So, given that I know the exact, fixed file size and have no reliability concerns, what filesystem should I use, and how should I tune it to eliminate as much overhead as possible?