We have a Windows 2008 file server that's going off-line intermittently. When it's down, the Windows 2003 web servers pile up requests with pending file operations.
I ran some tests using ColdFusion, and I noticed that if you request a file on a known-down or non-existent server, the initial request takes 15 seconds to time out. Subsequent requests fail quickly for the next 10 seconds or so. Then there is another 15-second timeout, and the pattern repeats.
I would like to configure both the maximum amount of time a request to a non-existent server can take (the 15 seconds), as well as how long the fact that the server is down is cached (the 10 seconds).
Is this something that can be tuned on Windows clients?
Edit: I got a capture from Wireshark showing NetBIOS name service (NBNS) packets:
No. Time Source Destination Protocol Info
90 2.184614 172.27.8.7 172.27.8.255 NBNS Name query NB CHASE-IE<20>
97 2.920946 172.27.8.7 172.27.8.255 NBNS Name query NB CHASE-IE<20>
106 3.671325 172.27.8.7 172.27.8.255 NBNS Name query NB CHASE-IE<20>
136 12.936379 172.27.8.7 10.0.2.15 NBNS Name query NBSTAT *<00><00><00><00><00><00><00><00><00><00><00><00><00><00><00>
140 14.436181 172.27.8.7 10.0.2.15 NBNS Name query NBSTAT *<00><00><00><00><00><00><00><00><00><00><00><00><00><00><00>
142 15.936134 172.27.8.7 10.0.2.15 NBNS Name query NBSTAT *<00><00><00><00><00><00><00><00><00><00><00><00><00><00><00>
You can see the 15 seconds the initial request is taking. It looks like it does a UDP broadcast to the whole subnet (172.27.8.255). It doesn't get an answer, and then somehow gets the right IP (10.0.2.15), perhaps via DNS. Then it spends a few seconds timing out to that server (it's offline).
The timeout can come from different sources. First, you should use something like TcpView to determine which port most of the time is being wasted on.
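If the capture is right and the delay is in NetBIOS broadcast name resolution, the broadcast retry count and interval are controlled by NetBT registry parameters. A sketch of what that might look like as a .reg fragment (the dword values here are illustrative examples, not recommendations; test before deploying):

    Windows Registry Editor Version 5.00

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NetBT\Parameters]
    ; BcastNameQueryCount: number of broadcast name queries sent (default 3)
    "BcastNameQueryCount"=dword:00000001
    ; BcastQueryTimeout: interval between broadcast queries, in ms (default 750 = 0x2ee)
    "BcastQueryTimeout"=dword:000002ee

Shrinking the retry count and interval should shorten the broadcast phase of the 15-second wait, but it affects name resolution for every host the client talks to, so a per-name fix like LMHOSTS is usually safer.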
I was able to reduce the initial waiting period from 15 seconds to 2 seconds by putting the server into lmhosts.
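For reference, a minimal LMHOSTS entry using the host name and IP from the capture above (assuming CHASE-IE at 10.0.2.15 is the file server); the #PRE tag preloads the mapping into the NetBIOS name cache so the broadcast step is skipped entirely:

    # %SystemRoot%\System32\drivers\etc\lmhosts  (no file extension)
    10.0.2.15    CHASE-IE    #PRE

After editing the file, run `nbtstat -R` to purge the name cache and reload the #PRE entries without rebooting.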