We are having a very strange problem that I'm hoping someone has seen before.
Our site is made up of html files with ASP code inside them. We have IIS set up to run the html files through the ASP engine with no problem.
The problem is that for some users browsing the site, the server takes forever to send its initial response, up to about 25-35 seconds. If we telnet to the public IP on port 80 from an affected machine and do a "GET /", it takes quite a while to respond. The catch is that if I telnet from the same machine to the web server's private IP behind the load balancer, I can browse the site just fine.
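For anyone wanting to repeat the telnet comparison with actual numbers, here is a minimal sketch in Python. It is not what we ran; the function name `time_to_first_byte` and the addresses in the usage note are placeholders I made up for illustration. It opens a plain TCP connection, sends a bare GET, and times how long the first byte of the response takes, separately from the connect time.

```python
import socket
import time


def time_to_first_byte(host, port=80, path="/", host_header=None, timeout=60.0):
    """Return (connect_seconds, first_byte_seconds) for a bare HTTP GET.

    Hypothetical helper for illustration: it mimics typing "GET /" into a
    telnet session and waiting for the server to start answering.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        connected = time.perf_counter()
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host_header or host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        # Blocks until the server sends its first byte (or closes the socket).
        sock.recv(1)
        first_byte = time.perf_counter()
    return connected - start, first_byte - connected
```

Run from an affected client, something like `time_to_first_byte("203.0.113.10", host_header="www.example.com")` against the public IP and `time_to_first_byte("10.0.0.10", host_header="www.example.com")` against the private IP (both addresses and the host name are placeholders) should show whether the delay sits in the connect phase or in the wait for the response, and whether it only appears on the path through the load balancer.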
The load balancers are Cisco ACEs and have been working fine for a couple of months now.
To make it stranger, the other two web clusters are not having the problem.
To test, I created a test.html which just has a couple of lines in it, and that can be pulled just fine via telnet or a web browser.
We've reduced the cluster down to a single machine for troubleshooting with no success. The production machines are Windows 2008 Web Edition. I've even set up a Windows 2003 Standard server and moved the site to that machine, which didn't help.
To make things even more confusing, not all client machines behind the same VIP have the problem. In the case of our office, all network requests go through a single public IP, but about 30% of the office machines are having this problem, while the rest can browse the site just fine.
Other sites on this cluster are having the problem as well, but we are looking at just the one to try and figure out the problem.
I've checked the disk queues on the web servers and can't see a problem.
It turns out there was a problem with our Cisco load balancer, and it needed to be rebooted.