I have been seeing a strange connection issue in our production environment.
The setup has two IBM HTTP Server (IHS) instances with a network IP load balancer in front of them (round-robin).
One moment the system is working fine; the next, requests stop arriving at the IHS. A telnet connection directly to port 80 of the IHS is established successfully, but a connection to port 80 through the load balancer's IP fails!
The puzzling part comes next: the network admins say the load balancer is working fine. Yet when we finally reboot the IHS servers, requests start flowing again...
This has happened three times in the last month, and no obvious pattern was found.
Any debugging ideas?
Sniff the traffic on the client side so you can see exactly where the delay or failure occurs. Better yet, capture on the client and the server at the same time.
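Alongside a packet capture, it may help to log timestamped connect attempts to each IHS directly and to the load balancer address, so the moment a failure starts can be matched against the capture. Below is a minimal sketch of such a probe; the function names `probe` and `watch` and the target labels are hypothetical and would need adapting to your environment.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=3.0):
    """Try one TCP connect; return (ok, detail)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.monotonic() - start) * 1000
            return True, f"connected in {elapsed_ms:.0f} ms"
    except OSError as exc:  # refused, timed out, unreachable, DNS failure
        return False, f"{type(exc).__name__}: {exc}"


def watch(targets, interval=10.0, rounds=None):
    """Probe every target each `interval` seconds, one log line per probe.

    `targets` maps a label to a (host, port) pair. `rounds=None` loops forever.
    """
    n = 0
    while rounds is None or n < rounds:
        for name, (host, port) in targets.items():
            ok, detail = probe(host, port)
            stamp = datetime.now().isoformat(timespec="seconds")
            status = "OK" if ok else "FAIL"
            print(f"{stamp} {name} {host}:{port} {status} {detail}")
        n += 1
        if rounds is None or n < rounds:
            time.sleep(interval)
```

Calling something like `watch({"ihs1": ("<ihs1-ip>", 80), "ihs2": ("<ihs2-ip>", 80), "lb": ("<lb-ip>", 80)})` from a client machine would show whether the direct probes keep succeeding while the load balancer probe starts failing, and at what time. A timeout on the load balancer path versus an immediate refusal also hints at different causes.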
This sounds like either an ARP issue or a DHCP issue (perhaps a rogue DHCP server on the network, or some sort of self-assigned IP addresses?).
The load balancer itself might be fine, but there might be something wrong between it and the HTTP servers. Three occurrences in a month suggests some kind of timeout or expiry (a DHCP lease renewal?).
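One way to test the ARP theory is to save the ARP table while things work and again during an outage, then compare which IP addresses now resolve to a different MAC address. The sketch below assumes a Linux host where the table is readable in the `/proc/net/arp` layout; on other platforms the output of `arp -a` would need its own parser. The helper names `parse_arp` and `changed_entries` are hypothetical.

```python
def parse_arp(text):
    """Parse text in /proc/net/arp layout into {ip: mac}.

    Assumes the first line is the column header and that the IP address and
    hardware address are the 1st and 4th whitespace-separated fields.
    """
    table = {}
    for line in text.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 4:
            table[fields[0]] = fields[3].lower()
    return table


def changed_entries(before, after):
    """Return {ip: (old_mac, new_mac)} for IPs whose MAC differs between snapshots."""
    return {
        ip: (before[ip], after[ip])
        for ip in before.keys() & after.keys()
        if before[ip] != after[ip]
    }
```

For example, read `/proc/net/arp` into a string on a host in the same segment during normal operation and again during an outage, then call `changed_entries(parse_arp(good), parse_arp(bad))`. A changed MAC for an IHS address or the load balancer address would point toward an address conflict; an empty result makes the ARP explanation less likely but does not rule it out.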