From experience, I've learned that every hard drive will fail; it's just a matter of time.
I learned my lesson the hard way, and now I keep backups.
When I buy new drives, I often shortlist them by warranty period. Hard drive manufacturers are there to make money, and obviously, most of the time, they design their drives to last at least through the warranty period, so after that period I expect the failure rate to climb. I have already had 2 of the 3 drives in a RAID 5 array fail almost at the same time (the second drive failed while the array was rebuilding, and yes, I had a recent backup).
My question is: what is the best practice for preventive replacement of hard drives in a RAID array after the warranty expires?
Do you worry about this at all? How many drives in the array do you replace?
Notes on responses
When creating a new array: use drives from different manufacturers / batches.
When you already have an aging array: add a new spare.
The Google study on hard drive failure rates showed less of a correlation with age than previously suspected. The best advice I have heard is to avoid building arrays out of disks from a single batch or a single manufacturer. The same study showed that failures among drives from the same manufacturing batch are strongly correlated, so they tend to die around the same time.
If you are concerned about the reliability of a RAID data set, my strong advice is to move to RAID10 or, failing that, RAID6.
Given the MTBF and unrecoverable read error rates per GB read, the chance of a second failure while rebuilding a degraded RAID5 set is far too high for comfort with the terabyte-sized drives on the market today (ref: http://hardware.slashdot.org/hardware/08/10/21/2126252.shtml).
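As a back-of-envelope illustration (my own sketch, not taken from the linked article; it assumes the commonly quoted consumer-drive spec of one unrecoverable read error per 1e14 bits read):

```python
import math

URE_RATE = 1e-14      # assumed: 1 unrecoverable read error per 1e14 bits read
BITS_PER_TB = 8e12    # bits in one (decimal) terabyte

def rebuild_failure_probability(drive_tb, surviving_drives):
    """P(at least one URE while reading every surviving drive end to end).

    Uses the Poisson approximation 1 - exp(-expected_errors), which is
    numerically safer than computing (1 - p)**bits for ~1e13 bits.
    """
    expected_errors = URE_RATE * BITS_PER_TB * drive_tb * surviving_drives
    return 1 - math.exp(-expected_errors)

# Rebuilding a 4-drive RAID5 of 1 TB drives means reading 3 survivors in full:
print(f"{rebuild_failure_probability(1.0, 3):.0%}")   # ~21%
```

At roughly a one-in-five chance of the rebuild tripping over an unrecoverable sector, the extra redundancy of RAID6 or RAID10 starts to look cheap.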
It depends on whether you're talking about server-class gear or desktop-class gear.
If it's a desktop machine built with your own money and off-the-shelf drives, and you're not worried about compatibility, then yes, your strategy is sound. Every X years, go out and buy all-new drives to replace your current ones. They're going to be faster, quieter, and larger. You could replace the drives individually, letting the array rebuild itself each time, and once the rebuilds are complete, reconfigure the array to be larger. (Not all RAID adapters support operations like this - online rebuilds and size changes.)
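For what it's worth, here is what that rolling replacement looks like on Linux software RAID (md). This is a sketch only: the device names /dev/md0, /dev/sdb1, and /dev/sdc1 are hypothetical, and hardware RAID adapters have their own tooling:

```python
# Rolling drive replacement on a Linux md array, one member at a time.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Mark the old drive as failed and pull it from the array.
run(["mdadm", "/dev/md0", "--fail", "/dev/sdb1", "--remove", "/dev/sdb1"])

# 2. Add the new, larger drive; md starts rebuilding onto it right away.
run(["mdadm", "/dev/md0", "--add", "/dev/sdc1"])

# 3. Once every member has been swapped and resynced, grow the array to
#    its new capacity (then grow the filesystem sitting on top of it).
run(["mdadm", "--grow", "/dev/md0", "--size=max"])
```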
If it's a server-class machine like an HP ProLiant or IBM System x, it gets more complicated. You may need to use hard drives from the compatibility list for your RAID adapter. In that case, the drives are going to be expensive, because they're probably no longer produced, or they're just plain expensive to begin with, as server-class gear tends to be. Even worse, you might be buying refurbished gear from your reseller without knowing it - this isn't uncommon with server resellers.
Plus, you may be discarding drives with perfectly good lifespans and replacing them with drives that are destined for trouble. Rather than proactively replacing drives, it makes more sense to build the server with a hot spare to begin with and make sure your RAID array supports automatic rebuilds onto a hot spare. Then the rebuild will happen before you even get out of bed to head to the datacenter, and you can replace the dead drive at your leisure without spending extra money or time.
I'll agree 100% about drives from the same batch all failing close together. I have 10 Dell workstations; after 4 years, 6 of the drives failed within 12 months of each other.
With production servers, I always bought from a place like Dell and made sure they would stock spares for at least as long as I planned to keep the server in operation, typically 4 years.
I've had 3 servers with RAID arrays have a drive fail on me. I never had hot spares, but Dell got me the replacements the next day and the rebuild was done in no time. That, plus proper backups, and you should be fine.
You can try using RAID6. It can survive 2 failed disks; also, be sure to always have a hot spare disk.
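To put a rough number on the difference, here is a toy calculation (my own sketch; the 5% per-drive failure chance during a rebuild window is an assumed figure, and the independence assumption is optimistic for same-batch drives, as noted in the other answers):

```python
from math import comb

def p_array_loss(n_drives, tolerated, p_fail):
    """P(more than `tolerated` of n drives fail in the same window),
    treating failures as independent (optimistic for same-batch drives)."""
    return sum(
        comb(n_drives, k) * p_fail**k * (1 - p_fail)**(n_drives - k)
        for k in range(tolerated + 1, n_drives + 1)
    )

p = 0.05  # assumed chance each drive dies during one rebuild window
print(f"RAID5, 6 drives (tolerates 1): {p_array_loss(6, 1, p):.2%}")  # ~3.28%
print(f"RAID6, 6 drives (tolerates 2): {p_array_loss(6, 2, p):.2%}")  # ~0.22%
```

With these toy numbers, the second parity drive cuts the chance of losing the array by roughly a factor of 15, and a hot spare shrinks the vulnerable window further by starting the rebuild immediately.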