So we have some Dell blades and chassis (the blades are M600s, the chassis M1000s) and other systems (an R710 with an MD3000 array). The R710 exports a source tree via NFS for the blades to build and test with.
The problem is that the blades lose their NFS mounts. Blades in the same chassis, with what seem like identical configurations, have their connections hang; they cannot even ping the server. They eventually come back.
The hardware is mostly Dell. We have one cable running from the R710 to a switch in one of the chassis, and another running to a switch and from there to the chassis. Both paths can have issues.
We are running CentOS 5 or Fedora Core release 5 (Bordeaux). The NFS server is running CentOS release 5.4 (Final).
Any thoughts? Troubleshooting tips?
These are all to the same host, but via different routes:
Through a switch:
[root@b053 ~]# ping svnwatch-data
PING storage.rack1.rinera.int (10.1.1.54) 56(84) bytes of data.
--- storage.rack1.rinera.int ping statistics ---
9 packets transmitted, 0 received, 100% packet loss, time 7999ms
Routed through another host:
[root@b053 ~]# ping svnwatch-data2
PING storage2.rack1.rinera.int (172.16.100.25) 56(84) bytes of data.
64 bytes from 172.16.100.25: icmp_seq=1 ttl=64 time=0.260 ms
64 bytes from 172.16.100.25: icmp_seq=2 ttl=64 time=0.217 ms
64 bytes from 172.16.100.25: icmp_seq=3 ttl=64 time=0.201 ms
64 bytes from 172.16.100.25: icmp_seq=4 ttl=64 time=0.264 ms
--- storage2.rack1.rinera.int ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.201/0.235/0.264/0.031 ms
With the host connected to a different chassis's switch (they are daisy-chained):
[root@b053 ~]# ping svnwatch-data-eth2
PING svnwatch-data-eth2.rack1.rinera.int (10.1.1.56) 56(84) bytes of data.
64 bytes from 10.1.1.56: icmp_seq=1 ttl=64 time=0.598 ms
64 bytes from 10.1.1.56: icmp_seq=2 ttl=64 time=0.096 ms
64 bytes from 10.1.1.56: icmp_seq=3 ttl=64 time=0.168 ms
--- svnwatch-data-eth2.rack1.rinera.int ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.096/0.287/0.598/0.222 ms
[root@b053 ~]#
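Since the outages come and go, it may help to log reachability of all three routes over time, so the start and end of each hang can be lined up with switch or server logs. The following is only a minimal sketch: `loss_pct` and `check_host` are hypothetical helper names, and it assumes the Linux iputils `ping` with its usual "N% packet loss" summary line.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any existing tool.

# Read ping output on stdin and print the packet-loss percentage
# taken from the summary line (e.g. "100" or "0").
loss_pct() {
    sed -n 's/.* \([0-9][0-9.]*\)% packet loss.*/\1/p'
}

# Ping one host three times (5 second overall deadline) and print
# a timestamped one-line result suitable for appending to a log.
check_host() {
    host=$1
    loss=$(ping -c 3 -w 5 "$host" 2>/dev/null | loss_pct)
    echo "$(date '+%Y-%m-%d %H:%M:%S') $host loss=${loss:-unknown}%"
}
```

On a blade, one could call `check_host` for `svnwatch-data`, `svnwatch-data2` and `svnwatch-data-eth2` in a loop with a `sleep` between passes, appending the output to a file. Comparing the timestamps across blades should show whether a route fails for one blade at a time or for a whole chassis at once.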
Here is what I would check.