Background
When a hard-drive controller detects an error and needs to remap a sector, the drive normally becomes unresponsive for the seconds (or possibly minutes) it takes to try to complete the remapping.
With the drive no longer responding, a host RAID controller can assume that the drive has failed, and mark it as unreliable.
Some hard-drive models, from some manufacturers, have a feature to limit (in seconds) how long the drive will spend trying to remap a sector. Different drive manufacturers give different names to this feature:
- Time-Limited Error Recovery (TLER): Western Digital
- Error Recovery Control (ERC): Seagate
- Command Completion Time Limit (CCTL): Samsung, Hitachi
Note: The correct term from the ATA/ATAPI command set is Command Completion Time Limit (CCTL).
Limiting the time the drive spends trying to recover a sector ensures that the host RAID controller will not conclude that the drive has failed.
Different RAID controllers (hardware and software) have different timeout intervals. If a drive is unresponsive for longer than the controller's timeout, it is marked as offline, e.g.:
- 3ware 9650SE: 20 s
- FreeBSD 6.3 (kern.geom.mirror.timeout): 4 s
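The relationship between a drive's recovery limit and a controller's timeout can be sketched in a few lines of Python. This is only an illustration: the two timeouts are the ones listed above, while the function names and the recovery limits passed in are hypothetical, and real controllers may apply retries or other logic before dropping a drive.

```python
# Controller timeouts quoted in the list above, in seconds.
CONTROLLER_TIMEOUTS_S = {
    "3ware 9650SE": 20,
    "FreeBSD 6.3 (kern.geom.mirror.timeout)": 4,
}


def survives(recovery_limit_s, controller_timeout_s):
    """Return True if the drive responds before the controller gives up.

    Assumption: the controller drops a drive as soon as it has been
    silent for the full timeout, so the limit must be strictly shorter.
    """
    return recovery_limit_s < controller_timeout_s


def dropped_by(recovery_limit_s, timeouts=CONTROLLER_TIMEOUTS_S):
    """List the controllers that would mark the drive offline."""
    return [
        name
        for name, timeout_s in timeouts.items()
        if not survives(recovery_limit_s, timeout_s)
    ]
```

For example, `dropped_by(7)` (a hypothetical 7-second recovery limit) names only the FreeBSD mirror, while `dropped_by(60)` (a drive with no effective limit) names both controllers.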
On to my question
Is there an option in Windows that controls how long Windows will wait before it decides a drive is not responding?
I do know of a registry setting called TimeoutValue:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue
- TimeoutValue
- Location: HKLM\System\CurrentControlSet\Services\Disk\TimeoutValue
- Values: 1 - 255 seconds
- Meaning: Time in units of seconds before an SRB request initiated by the disk class driver will time out. If this registry value is not set, a default value of 10 seconds is used. Time-out values for requests that are initiated by class drivers vary according to the class driver.
- Operating system version: This feature is available in all versions of the Windows operating systems.
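For anyone who wants to check what their machine is currently using, here is a minimal Python sketch that reads the value with the standard-library `winreg` module and applies the documented default of 10 seconds when the value is absent. The function names are hypothetical, rejecting out-of-range values with an exception is my own assumption (the documentation quoted above only gives the range), and nothing here shows whether the setting affects software RAID.

```python
import sys

DEFAULT_TIMEOUT_S = 10  # documented default when the value is not set
KEY_PATH = r"System\CurrentControlSet\Services\Disk"
VALUE_NAME = "TimeoutValue"


def effective_timeout(raw):
    """Map a raw registry value (or None if unset) to a timeout in seconds."""
    if raw is None:
        return DEFAULT_TIMEOUT_S
    if not 1 <= raw <= 255:
        # Assumption: treat anything outside the documented 1-255 range
        # as an error rather than guessing what the driver would do.
        raise ValueError(f"TimeoutValue out of documented range: {raw}")
    return raw


def read_raw_timeout():
    """Return the stored TimeoutValue, or None if it is not set."""
    if sys.platform != "win32":
        raise RuntimeError("the registry is only available on Windows")
    import winreg  # Windows-only standard-library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except FileNotFoundError:
        # Key or value missing: the driver falls back to its default.
        return None
```

On a Windows machine, `effective_timeout(read_raw_timeout())` gives the timeout in seconds that the documentation above says the disk class driver would use.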
But this is only documented as applying to the SCSI Miniport Driver. And even if it also applies to my SATA drives, that doesn't guarantee it also applies to Windows' RAID-5 subsystem.
The reason I ask about adjusting the timeout in my (software) controller is that hard-drive manufacturers have started to get mean-spirited:
- they no longer include the ability to limit the error-recovery time
- they intentionally lock one feature of the ATA/ATAPI command set behind a paywall
For that firmware feature, they want you to buy the more expensive ("RAID Edition") drives (e.g. 71% more expensive). Thus changing:
- Rᴇᴅᴜɴᴅᴀɴᴛ Aʀʀᴀʏ ᴏꜰ Iɴᴇxᴘᴇɴsɪᴠᴇ Dɪsᴋs¹ (RAID), to
- Rᴇᴅᴜɴᴅᴀɴᴛ Aʀʀᴀʏ ᴏꜰ Exᴘᴇɴsɪᴠᴇ Dɪsᴋs
Bonus Reading
See also
- Green drives dropping from RAID, TLER/ERC problems?
- MSDN: Registry Entries for SCSI Miniport Drivers
- StorageReview.com: How to use "desktop" drives in RAID without TLER/ERC/CCTL
- StorageReview.com: TLER / CCTL Disks which can have their TLER / CCTL values changed
- LSI 15639, User Guide for 9650SE 9690SA from 9.5.2 Complete Codeset
- T13 AT Attachment 8 - ATA/ATAPI Command Set (ATA8-ACS) (pdf)
- Western Digital - Difference between Desktop edition and RAID (Enterprise) edition drives
1 Yes, inexpensive. Source: "A Case for Redundant Arrays of Inexpensive Disks (RAID)", D.A. Patterson, G. Gibson, and R.H. Katz, ACM SIGMOD Conference, June 1988, Chicago IL.
This doesn't really answer your question per se, but as a recommendation: in my experience, software RAID controllers are far less reliable than hardware controllers. If you have the budget, always opt for a stand-alone card to take the burden of disk I/O off the computer itself.