I need to set up an in-memory storage system for around 10 GB of data, consisting of many single files (images) of about 100 KB each. There will be lots of reads and fairly periodic writes (adding new files, deleting some old ones).
Now, I know that tmpfs behaves like a regular file system for which you can, for example, check free/used space with df, which is a nice feature to have. However, I'm interested in whether ramfs would offer some advantages with regard to the speed of I/O operations.
I know that I cannot control the size of consumed memory when using ramfs and that my system can hang if it completely consumes the free RAM, but that will not be an issue in this scenario.
To sum it up, I'm interested:
- Performance-wise, which is faster: ramfs or tmpfs (and possibly why)?
- When does tmpfs use swap space? Does it move already-saved data to swap (to free RAM for other programs currently running), or only new data if there is no free RAM left at that moment?
My recommendation:
Measure and observe real-life activity under normal conditions.
Those files are unlikely to ALL be needed and served from cache at all times. But there's a nice tool called vmtouch that can tell you what's in cache at a given moment. You can also use it to lock certain directories or files into cache. So see what things look like after some regular use. Using tmpfs or ramfs is not necessary for this situation.
See: http://hoytech.com/vmtouch/
I think you'll be surprised to see that the most active files will probably be resident in cache already.
As far as tmpfs versus ramfs, there's no appreciable performance difference. There are operational differences. A real-life use case is Oracle, where ramfs was used to allow Oracle to manage data in RAM without the risk of it being swapped. tmpfs data can be swapped out under memory pressure. There are also differences in resizing and modifying settings on the fly.
Don't over-think this. Put enough RAM in your system and let the kernel's disk cache take care of things for you. That way you get the benefit of reads coming directly from memory, while still being able to persist data on disk.
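If you want to see the disk cache at work before deciding anything, here is a minimal Python sketch (my own illustration, not a replacement for vmtouch) that reads every file under a directory and reports how long it took. The function name read_pass is made up for this example.

```python
import os
import time


def read_pass(directory):
    """Read every regular file under `directory` once.

    Returns (bytes_read, seconds_elapsed).
    """
    total = 0
    start = time.perf_counter()
    for root, _dirs, files in os.walk(directory):
        for name in files:
            with open(os.path.join(root, name), "rb") as fh:
                # Read in 1 MiB chunks so large files need not fit in memory.
                while True:
                    chunk = fh.read(1 << 20)
                    if not chunk:
                        break
                    total += len(chunk)
    return total, time.perf_counter() - start
```

Calling read_pass on your image directory twice in a row and comparing the two timings gives a rough idea of how much is served from memory on the second pass, assuming the data fits in free RAM and nothing evicts it in between.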
1) Performance benchmark.
Using this page as a reference, I did an I/O comparison between tmpfs and ramfs, and the results are pretty much identical in terms of performance.
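For anyone who wants to repeat this kind of comparison on their own hardware, a rough Python sketch follows. It is not the benchmark from the referenced page; bench_dir is a hypothetical helper, and the default of 200 files of 100 KB each is just an assumption chosen to resemble the workload in the question.

```python
import os
import time


def bench_dir(path, n_files=200, size=100 * 1024):
    """Write, then read back, n_files files of `size` bytes inside `path`.

    Returns (write_seconds, read_seconds). The files are removed afterwards.
    """
    payload = os.urandom(size)
    names = [os.path.join(path, "bench_%d.bin" % i) for i in range(n_files)]
    try:
        start = time.perf_counter()
        for name in names:
            with open(name, "wb") as fh:
                fh.write(payload)
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        for name in names:
            with open(name, "rb") as fh:
                fh.read()
        read_s = time.perf_counter() - start
    finally:
        # Clean up even if a write or read failed part-way through.
        for name in names:
            if os.path.exists(name):
                os.remove(name)
    return write_s, read_s
```

Run it once against a tmpfs mount and once against a ramfs mount (for example bench_dir("/mnt/tmpfs") and bench_dir("/mnt/ramfs"), where both paths are placeholders for wherever you mounted them) and repeat a few times, since single runs are noisy.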
2) According to this page, tmpfs uses swap, and ramfs does not use swap.
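One way to watch this on a running Linux box is to look at /proc/meminfo, where the Shmem field covers shared memory including tmpfs, alongside SwapTotal and SwapFree. The sketch below is only an illustration; parse_meminfo and swap_used_kb are names I made up for it.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def swap_used_kb(info):
    """Swap currently in use, in kB, from a parsed meminfo dict."""
    return info["SwapTotal"] - info["SwapFree"]
```

On Linux you would feed it open("/proc/meminfo").read() and compare Shmem and the swap figure before and after filling a tmpfs mount under memory pressure. The numbers are system-wide, so other activity on the machine will show up in them too.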
If you have enough RAM installed to host the various kernel buffers, the application stacks and heaps, the regular file system cache, and all the files you intend to put in it, ramfs should never be slower than tmpfs, as there is no risk of physical I/O by design. Physical I/O is undoubtedly the main cause of performance degradation in that area.
However, if you do not have that much RAM installed, using ramfs might, and probably will, be slower than tmpfs, as the latter uses the virtual memory heuristics to decide what should be on disk (i.e. in the swap area) and what should stay in RAM, while with ramfs your file system data is stuck in RAM, which might be a waste of resources.
To answer your second question: yes, tmpfs will move old data to the swap area first, not the most recent "hot" data.