I have 2 servers on the same switch. I'm losing 5% of packets on ~16k pings between the two.
Below is my nasty ASCII diagram of the network configuration. All machines have a single interface.
a       b
|       |
-- S1 --
   |
   S2
   |
   S3
   |
   c
a = Sun Netra 240
b = Dell 2950
c = my machine
S1 - S3 = 3 x Cisco Catalyst 2960G
pings from a -> b lose 5% data
pings from b -> a lose 5% data
pings from c -> a lose 0 data
pings from c -> b lose 0 data
I can't think of a reason that I'd lose packets going between ports on the same switch, when I didn't lose data coming from a different switch but still using the same port.
Can anyone throw any ideas my way please?
Thanks
Do you get any loss if you ping using the default packet size? How about if you ping using ping -l 1472? How about when pinging using ping -l 1473?
Try pinging from C to A, C to B, A to B, and B to A using ping -l 1473 -f and post the results of each of them here.
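For anyone wondering why 1472 and 1473 are the interesting sizes: the `-l` and `-f` flags above are Windows ping syntax (payload size and Don't Fragment). Assuming a standard 1500-byte Ethernet MTU and an IPv4 header without options, the arithmetic can be sketched like this. The function names are just illustrative.

```python
# Header sizes, assuming IPv4 without options and an ICMP echo request.
IPV4_HEADER = 20
ICMP_HEADER = 8


def max_ping_payload(mtu: int = 1500) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def needs_fragmentation(payload: int, mtu: int = 1500) -> bool:
    """True if a ping with this payload size would exceed the MTU."""
    return payload > max_ping_payload(mtu)


print(max_ping_payload())         # 1472
print(needs_fragmentation(1472))  # False
print(needs_fragmentation(1473))  # True
```

So a 1472-byte ping is the largest that passes in one frame, and a 1473-byte ping with Don't Fragment set should fail outright if the path MTU really is 1500.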
Another troubleshooting step would be to plug both machines into a different switch to see if the problem moves with the devices. My guess would be that you either have an interference problem as entens suggests, or one of those boxes is load bound and dropping packets.
NIC driver? Duplex settings? Any errors showing up on the switches? What are you using to measure the loss, ping?
Also, try disabling any offloading (checksum offloading, etc.) on the NIC if enabled, so you can use Wireshark to find out what kind of traffic you lose.
Hope that gives you some ideas.
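If ping is the measuring tool, a small script makes it easier to compare runs between the different host pairs. This is a rough sketch that assumes the summary line looks like "N packets transmitted, M received" (or "M packets received"), which is the common Unix-style format. The sample string is made up for illustration, not real output from the boxes in question.

```python
import re

# Matches both "15200 received" and "15200 packets received".
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def loss_percent(ping_output: str) -> float:
    """Compute packet loss in percent from a ping summary line."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


# Hypothetical summary line, for illustration only.
sample = "16000 packets transmitted, 15200 received, 5% packet loss, time 16012345ms"
print(loss_percent(sample))  # 5.0
```

Computing the figure from the raw counts avoids relying on the rounded percentage that ping prints.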
We have encountered cases where having the switch port and/or the NIC set to auto speed and/or auto duplex results in loss. Changing from auto to a fixed speed and duplex resolved the issue.
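To see what a NIC actually negotiated, here is a minimal sketch that assumes a Linux host, where the kernel exposes `speed` and `duplex` under `/sys/class/net/<iface>/`. It will not work on Solaris, and the interface name `eth0` is only an example. The `sysfs_root` parameter and both function names are my own, added so the code can be pointed at a test directory.

```python
from pathlib import Path


def link_settings(iface: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Read the negotiated speed (Mb/s) and duplex for an interface."""
    base = Path(sysfs_root) / iface
    settings = {}
    for name in ("speed", "duplex"):
        try:
            settings[name] = (base / name).read_text().strip()
        except OSError:
            # Missing interface, link down, or attribute not readable.
            settings[name] = "unknown"
    return settings


def looks_suspicious(settings: dict) -> bool:
    """Flag anything other than full duplex as worth a closer look."""
    return settings["duplex"] != "full"


if __name__ == "__main__":
    current = link_settings("eth0")  # example interface name
    print(current, "suspicious:", looks_suspicious(current))
```

A half-duplex result on one end while the switch port reports full duplex would point at a negotiation mismatch.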
Check the NIC and the Cat cables too. Also, is there any other network traffic in the background?
It "looks" like the problem was port 0 on the NIC in the Sun box. We've transferred all the traffic to port 1 and the problem has vanished.
I'm not holding my breath though; this is the second time this year that this has happened. I had a bad feeling about the box when I found out that it had been end-of-lifed 3 months after we bought it and had a memory failure 2 weeks before the end of the first year, and the boss won't pay for a service contract on it, preferring case-by-case payment.
Thanks to everyone who suggested courses of action.