Related to this question about using an SSD for system and HDD for data, except I would like my system to do this automatically...
Is it possible to have several layers of storage and push items automatically between them, using preferably free, open-source software?
I know of hugely expensive enterprise-class solutions like the EMC SAN->EMC Centera automatic archiving, but I was wondering if this sort of staged storage is possible to do automatically.
It would be nice to be able to have several layers in this: Memory -> SSD -> HDD -> slower HDD or tape or some other archive solution.
Are there any filesystems which can do this automatically? (ZFS, Btrfs, HAMMER?)
Any Unix-variants are fine, as I'm interested in how this works and whether it's likely to be portable to Linux or other flavours (BSD etc).
Well, ZFS uses a storage layering called Hybrid Storage Pool (HSP), which puts SSDs in front of the hard disks as a read cache (L2ARC) and as a separate intent-log device (ZIL).
With HSP it's easy to automatically benefit from the advantages of SSDs compared to a hard-disk-only solution. A system using HSP can be both faster and cheaper than the latter. See this link for some nice examples and more details.
I think there are plans regarding hierarchical storage management (HSM) for ZFS (see for example the Automatic Data Migration (ADM) OpenSolaris project) but I don't know its current status.
Check out the LVM-based "lvmts" (LVM Tiered Storage) solution this guy is cooking up:
https://bbs.archlinux.org/viewtopic.php?pid=1140640#p1140640
Pretty cool.
TIER seems to answer your needs. It is a Linux kernel module that can create tiered storage. It seems to learn the access pattern by itself and optimize the placement of data across the tiers.
http://www.lessfs.com/wordpress/?p=776
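To make "learning the pattern" a bit more concrete, here is a minimal Python sketch of heat-based placement: count accesses per block and keep the most frequently used blocks on the fast tier. This is only an illustration of the general idea, not how TIER itself is implemented; the `HeatMap` class, its method names and the `fast_capacity` parameter are all made up for this example.

```python
from collections import Counter


class HeatMap:
    """Toy access tracker (hypothetical, not TIER's real algorithm)."""

    def __init__(self, fast_capacity):
        # Number of blocks that fit on the fast tier (assumed unit: blocks).
        self.fast_capacity = fast_capacity
        self.hits = Counter()

    def record_access(self, block):
        """Note one read or write of the given block number."""
        self.hits[block] += 1

    def placement(self):
        """Return (fast, slow): the sets of blocks each tier should hold."""
        # Hottest first; ties are broken by block number so the result
        # is deterministic.
        ranked = sorted(self.hits, key=lambda b: (-self.hits[b], b))
        fast = set(ranked[:self.fast_capacity])
        slow = set(ranked[self.fast_capacity:])
        return fast, slow


heat = HeatMap(fast_capacity=2)
for block in [1, 1, 1, 2, 2, 3]:
    heat.record_access(block)
fast, slow = heat.placement()
```

A real implementation would also have to age the counters over time and limit how much data it migrates per pass, which this sketch leaves out.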
SAM-QFS is Sun's existing product and was open-sourced last year. It's CDDL, so you could only directly port it to *BSD.
Answering my own question with something I just found:
I was just updating the kernel and looking at the new stuff that has been added, and there is now a 'CACHEFILES' option which allows for caching (usually remote) filesystems to a local filesystem. I guess I could use this to cache a slower storage mechanism (HDD) to a faster one (SSD), at least for one level of hierarchy.
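For illustration, the same one-level idea can be sketched in user space with Python: on the first read a file is copied from the slow location to the fast one, and later reads are served from the fast copy. This is not how the kernel's CACHEFILES option works internally; the function name `read_through` and the `slow_root` / `cache_root` directories are assumptions made for the example, and there is no invalidation or eviction.

```python
import shutil
from pathlib import Path


def read_through(rel_path, slow_root, cache_root):
    """Return the bytes of rel_path, caching it under cache_root on first read.

    Hypothetical helper: slow_root stands for the HDD mount point and
    cache_root for a directory on the SSD.
    """
    cached = Path(cache_root) / rel_path
    if not cached.exists():
        # Cache miss: copy the file (with its metadata) to the fast tier.
        cached.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(Path(slow_root) / rel_path, cached)
    # Cache hit, or freshly populated: serve from the fast tier.
    return cached.read_bytes()
```

Because nothing ever checks whether the slow copy has changed, this only suits data that is written once and then read many times.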
Some relevant links. YMMV.
http://code.google.com/p/fscops/ -- "Online Hierarchical Storage Manager (OHSM) is the first attempt towards an enterprise level open source data storage manager which automatically moves data between high-cost and low-cost storage media."
http://www.tack.ch/unix/dmapi/ -- XFS + DMAPI under Linux
http://jfs.sourceforge.net/ -- JFS + DMAPI under Linux
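As a rough picture of what "automatically moves data between high-cost and low-cost storage media" means at the file level, here is a minimal Python sketch that demotes files from a fast directory to a slow one once they have not been accessed for a while. It is not taken from OHSM or DMAPI; the function name, the directory arguments and the idle threshold are assumptions for illustration. It also relies on access times, which are only approximate on filesystems mounted with `relatime` and not updated at all with `noatime`.

```python
import shutil
import time
from pathlib import Path


def demote_cold_files(fast_dir, slow_dir, max_idle_seconds, now=None):
    """Move files idle for at least max_idle_seconds from fast_dir to slow_dir.

    Hypothetical policy: "idle" is measured from the last access time.
    Returns the list of new locations on the slow tier.
    """
    now = time.time() if now is None else now
    fast, slow = Path(fast_dir), Path(slow_dir)
    moved = []
    # sorted() materialises the listing before anything is moved.
    for path in sorted(fast.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue  # only regular files are migrated
        if now - path.stat().st_atime < max_idle_seconds:
            continue  # still warm, keep it on the fast tier
        target = slow / path.relative_to(fast)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Run from cron, for example as `demote_cold_files("/mnt/ssd/data", "/mnt/hdd/data", 30 * 24 * 3600)` (both paths are placeholders), this gives a crude one-way tier. A real HSM also recalls files transparently when they are opened again, which needs filesystem hooks such as DMAPI.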
RAID controllers offer some of these features.
"Leverages SSDs in front of HDD volumes to create high-capacity, high-performance controller cache pools"
http://www.lsi.com/channel/products/storagesw/Pages/MegaRAIDCacheCadeSoftware2-0.aspx
LVM2 snapshots come to mind... but you can't really do more than a single snapshot.
btier (7 years old)
autofs (recent, file based)