Is it safe to back up data to a hard drive and then leave it for a number of years?
Assuming the file system format can still be read, is this a safe thing to do? Or is it better to rewrite the data regularly (every 6 months or so) to make sure it remains valid?
Or is this a stupid question?
I wouldn't trust important backups to any single device for any significant length of time.
I've had plenty of CDs that couldn't be read after a while. (Cheap ones, admittedly, but I'm leery of the longevity claims made.)
I've had hard disks silently corrupt data.
I seem to remember I've even had SSD failures, although with a low number of writes I'd expect them to be pretty reliable.
Aside from all of these things, using a single copy means you've got no protection against physical disasters: fire etc. If you have multiple copies, you can separate them physically. Ideally I'd take some number (e.g. 3) of copies and run a checksum (I usually use MD5) periodically over everything. If one of the copies becomes corrupt in some way, the other copies should still agree with each other, so you can trust the majority and create a new backup to replace the corrupted one. (Of course, if you keep the correct checksums in a separate place, you could trust even a single backup which still gives the right checksums, as the canonical source for replacements.)
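The "trust the majority" idea could look roughly like this in Python. This is only a sketch: the function names are made up for illustration, it assumes each copy is a single file (for a whole backup you would run it once per file), and MD5 is used only because it is mentioned above.

```python
import hashlib
from collections import Counter
from pathlib import Path
from typing import List, Optional, Tuple


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupt_copies(copies: List[Path]) -> Tuple[Optional[str], List[Path]]:
    """Hash every copy and return (majority digest, copies that disagree).

    If no digest is shared by a strict majority of the copies, nothing can
    be trusted, so the result is (None, all copies).
    """
    if not copies:
        raise ValueError("need at least one copy to compare")
    digests = {copy: file_md5(copy) for copy in copies}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        return None, list(copies)
    return winner, [copy for copy, value in digests.items() if value != winner]
```

Any path returned in the second element is a candidate for being overwritten from one of the copies that matched the majority digest.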
Of course, how much trouble you go to depends on the value of the data. My personal home data is only backed up on a RAIDed NAS. My work data is in Google datacenters, which I trust fairly strongly :)
Given your other options for backup, HDD is the safest way to go. Other options include Magnetic Tape, SSD and optical media.
Let's examine the pitfalls of each:
MT: More prone to erasure when exposed to a magnetic field than an HDD. Readers are also becoming harder and harder to find. You don't want to come back in 5 years and find that there's no way to read the data off your medium.
SSD: Reliable in that there are no moving parts. They are prone to electrical degradation after many write cycles, which is troublesome and potentially dangerous. The likelihood of losing data while the drive is not in use is slim, however.
Optical Media: The least reliable of the bunch. They're extremely prone to physical degradation (bending/warping) and it requires very little to throw them out of their deflection spec. Further, the encoding scheme used to write data to most optical media is rather complex, creating a greater likelihood of single element failure leading to unreadability.
HDDs: Solid, sealed devices. They can be damaged by physical shock more easily than most of the above devices, and they have precise mechanical parts that can cause failed reads/writes if damaged.
The benefit of HDDs, however, is that they ARE sealed. All of the moving parts are stored in an air-filtered enclosure. The magnetic stability of the bits on the disk is quite high and unlikely to change.
Further, if the mechanical parts fail, it is possible to have the platters removed and the data recovered from them directly.
There's no perfect option, but of the imperfect ones, HDD would probably be your best bet.
I would say you should recycle the media every other year or so - that is, replace the drive, disc or tape with whatever is current at the time - and keep more than one copy.
Few things last forever. Optical media can degrade rapidly depending on its quality, the method used to write to it and the environment where it's stored. Mechanical parts can always fail, and there could be firmware bugs related to time or to wear and tear.
I've pondered your question often; it would be convenient to have something that is guaranteed to keep working for, say, 5 years. There are tapes and other forms of backup media rated for 10 or more years, but I'd never trust that, at least not without a decent amount of redundancy (several copies on different batches).
Keeping the data fresh and continually recycled seems to be the reliable way to go - that way you get to test it regularly as well.
From the article
What advice do you have for long-term storage of disk drives and other media?
HDDs actually have quite a high life expectancy, at least on the magnetic side (setting external magnetic fields aside). The main problem is that they can eventually suffer mechanically, i.e. fail to spin up if they are not used regularly, as some oils and couplings can become a problem.
The safest approaches to really long-time storage in my opinion are:
Optical media, especially the kinds available for consumer use, have an unexpectedly low life expectancy. You should check the quality of the raw data read at least every two years. You could have lost data in the meantime, though.
EDIT: An important aspect in this case is also that you should add checksums to the stored files (MD5, SHA1, etc.), so you'd be able to tell whether or not some corruption has occurred.
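As a rough illustration of adding checksums to the stored files, something like the following could build a manifest when the backup is written and check it again later. The function names and the manifest layout (relative path mapped to hex digest) are assumptions made for this sketch, not any standard format.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def hash_file(path: Path, algorithm: str = "sha1", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file using the named hashlib algorithm."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path, algorithm: str = "sha1") -> Dict[str, str]:
    """Map each file under root (as a relative POSIX path) to its digest."""
    return {
        path.relative_to(root).as_posix(): hash_file(path, algorithm)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: Dict[str, str],
                    algorithm: str = "sha1") -> List[str]:
    """Return the relative paths that are missing or no longer match."""
    problems = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or hash_file(target, algorithm) != expected:
            problems.append(relative)
    return problems
```

The manifest is a plain dict, so it can be saved with `json.dump` and kept somewhere other than the backup drive itself; an empty list from `verify_manifest` means every listed file still matches.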
Do not store your hard drives for any length of time. They are designed to be on. If you don't let a drive spin up every now and then, it will go bad. I'm talking months or a year here.
They will break if not used. MTBF is "guaranteed" for drives in use, not in storage.
Hard drives are fine, but to keep the best file integrity you need to recopy the information every now and then. If you just leave the information sitting there and expect it to be perfect in 5 years, think again. You should recopy all information on HDDs every 6 months or so if they are not in regular use. It's fine to use the same drive; you just need to recopy all of the information on it so the contents are written fresh.
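For what it's worth, a same-drive "recopy" could be scripted along these lines. It is a minimal sketch with made-up function names: each file is copied to a temporary sibling, the copy is compared against the original by SHA-256, and only then is it swapped in. Note that the comparison may be served from the operating system's cache rather than from the platters, so it is a sanity check and not proof of what landed on disk.

```python
import hashlib
import os
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_file(path: Path) -> None:
    """Rewrite one file: copy it, check the copy, then swap it into place."""
    before = sha256_of(path)
    temp = path.with_name(path.name + ".refresh")  # hypothetical temp suffix
    shutil.copy2(path, temp)  # copies the contents plus timestamps
    if sha256_of(temp) != before:
        temp.unlink()
        raise OSError(f"fresh copy of {path} does not match the original")
    os.replace(temp, path)  # the new copy takes over the original name


def refresh_tree(root: Path) -> int:
    """Refresh every regular file under root; return how many were rewritten."""
    count = 0
    # sorted() materialises the listing first, so temp files are never visited
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            refresh_file(path)
            count += 1
    return count
```

Calling `refresh_tree(Path("/mnt/backup"))` (the mount point is just an example) would rewrite every file on the drive in turn; the drive needs enough free space for the largest single file.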
To avoid some of the issues above, I would start with a fairly new hard drive that hasn't seen too many reads/writes. I've had great performance from LaCie drives, but I also have some Western Digital My Books that have held up well even though they were "budget" hard drives. (The My Books have had slightly more corruption over time than the LaCies, but I'm guessing this is relative and can just be the luck of the draw with any manufacturer. But don't quote me on this.) I'm in the process of backing up all of my digital photography onto a couple of extra externals to store in two separate places besides the two backups on my desk. One is going to my parents' house 2000 miles away and the other is going to a safety deposit box. I plan to update/recopy the safety deposit box drive every few months and the drive at my parents' house every 6 months to a year.
The 2 backups on my desk get updated every week or so.
I've had HDDs fail while "in storage": after sitting in a climate-controlled room for a few years, they refused to spin up or boot when called into duty again.
So no, I wouldn't say that this is a particularly good idea. As others have said, as part of a blunderbuss strategy it is one way of keeping a copy of your data, but it probably shouldn't be your only one.
It sounds like you're less concerned about hardware failure than about file corruption and bit rot. In that case, ZFS is your best ally. If data preservation is your goal, consider using RAIDZ2 if you can afford it, or at least RAIDZ1. RAIDZ is comparable to RAID5, except that it uses a variable stripe width to eliminate the infamous RAID5 write hole. This is especially useful with a cheap NAS, because a power failure is unlikely to corrupt the array. File corruption and bit rot are handled by checksumming: ZFS stores a checksum for every block and verifies it whenever the block is read, and a scrub walks the whole pool to check (and, given redundancy, repair) data that hasn't been read recently. That's just the tip of the iceberg of why ZFS is THE filesystem of choice.
If you want an easy NAS setup at home with ZFS built in, check out http://freenas.org. The latest release candidate includes ZFS, and it's not that hard to set up.
It will be interesting to see the long term results of switching to ZFS simply for data preservation... it's too new at the moment. However, the facts are all there, and it's a no-brainer: the best file system for data integrity is ZFS.
If you want your data to survive for any period of time:
Using a third-party provider is another alternative. Something like Amazon S3, Mozy or a similar service gives you an ultra-low-cost way to store stuff.