I'm implementing a monitoring system for an existing, modest-sized data center deployment.
So far I've only gotten to the host/application side of the monitoring equation, but I'm noticing what I consider an alarming number of Ethernet errors on various hosts. To me, alarming is 3 or 4 per day per host (some have none). When I look at the SNMP counters for the switches, I again see lots of errors, but I'm not graphing those counters (yet).
In my prior environments, with many more ports, my error rate was approximately zero except on hosts that had actual problems, like duplex mismatches.
None of these interfaces are saturated; they're pushing approximately 40-50 megabytes / sec over gig links.
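To back up the "not saturated" claim, here's the quick arithmetic (the 40-50 MB/s figures are from my graphs; the rest is just unit conversion):

```python
# How close is 40-50 MB/s to filling a 1 Gbit/s link?
link_bps = 1_000_000_000  # gigabit link capacity, bits/sec

for mbytes_per_sec in (40, 50):
    bits_per_sec = mbytes_per_sec * 1_000_000 * 8  # bytes/sec -> bits/sec
    utilization = bits_per_sec / link_bps
    print(f"{mbytes_per_sec} MB/s = {utilization:.0%} of a gig link")
```

So the links are sitting around 30-40% utilization, well short of the point where drops from congestion would be expected.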
My gut feeling is that there shouldn't be any errors at all on any interface if everything is working properly, but I'm worried that if I pick a fight over resolving these problems I'll just alienate everyone else who believes "it works fine; it's been working that way for years."
Anyone have good stories/studies/statistics for when to be alarmed at Ethernet errors? Or something to indicate how even a small volume of errors would affect, say, an iSCSI volume?
Thanks!
TCP/IP handles errors quite well: a single dropped or mangled frame will be retransmitted, and everything will generally be hunky-dory.
A consistent 3-4 errors per day is worth noting because it points at a possible problem (bad cable, bad port, etc.), but at that volume it isn't an itch worth scratching. A single error could be the result of anything from electromagnetic interference to a very ill-positioned subatomic event. Either way, the impact on your network is negligible.
If it's going to become a political issue, just leave it be (but keep an eye on it). I'd only throw a fit if I started seeing errors MUCH more often, or at least a higher percentage of total packets. 0.1% may be a good threshold, but it's all a matter of how armored the neck you'll be sticking out is.
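To turn that 0.1% rule of thumb into something you can actually alert on, here's a minimal sketch. The counter names follow IF-MIB conventions (ifInErrors, packet counters), but the polling mechanism and the numbers below are made up for illustration:

```python
def error_rate(errors_delta: int, packets_delta: int) -> float:
    """Fraction of packets in error over one polling interval."""
    if packets_delta == 0:
        return 0.0  # idle interface: nothing to judge
    return errors_delta / packets_delta

THRESHOLD = 0.001  # 0.1% of packets, per the rule of thumb above

# Hypothetical deltas between two SNMP polls of a switch port:
errors, packets = 4, 2_000_000  # ~4 errors over ~2M packets in a day
rate = error_rate(errors, packets)
print(f"rate = {rate:.6f}, alarm = {rate > THRESHOLD}")
```

The point being: 4 errors against millions of packets is orders of magnitude below the threshold, which is why I'd watch the trend rather than chase individual errors.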