For the past few days I've been seeing the same kind of message repeating in the logs, and I can say with certainty that nothing was intentionally changed (installed or uninstalled) in that period.
Here's a sample from /var/log/kern.log:
Mar 30 06:32:45 aurora kernel: [566322.867110] e1000e: eth0 NIC Link is Down
Mar 30 06:32:47 aurora kernel: [566325.313634] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Mar 30 06:32:59 aurora kernel: [566337.632930] e1000e: eth0 NIC Link is Down
Mar 30 06:33:18 aurora kernel: [566356.543664] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Mar 30 11:05:47 aurora kernel: [582689.779752] e1000e: eth0 NIC Link is Down
Mar 30 11:05:50 aurora kernel: [582692.174337] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
From the complete log file, counting all messages of this kind, I can conclude:
- eth0 fails every few hours
- eth0 stays down for about 2 seconds in the first case above and about 19 seconds in the second
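The durations above can be checked against the whole log rather than by eye. Here is a minimal sketch in Python, assuming the kern.log line format shown in the sample; the function name `link_outages` and the regular expression are illustrative, not part of any existing tool. It pairs each "Link is Down" with the next "Link is Up" and uses the kernel timestamp in square brackets to compute how long the link was down.

```python
import re

# Matches lines like:
# Mar 30 06:32:45 aurora kernel: [566322.867110] e1000e: eth0 NIC Link is Down
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+"
    r"e1000e: (?P<iface>\S+) NIC Link is (?P<state>Up|Down)"
)


def link_outages(lines, iface="eth0"):
    """Return a list of (down_timestamp, seconds_down) for each Down/Up pair."""
    outages = []
    down = None  # (syslog timestamp, kernel uptime) of the last unmatched Down
    for line in lines:
        m = LINK_RE.match(line)
        if not m or m.group("iface") != iface:
            continue
        uptime = float(m.group("uptime"))
        if m.group("state") == "Down":
            down = (m.group("ts"), uptime)
        elif down is not None:
            # Link came back: outage length is the difference in kernel uptime.
            outages.append((down[0], round(uptime - down[1], 1)))
            down = None
    return outages
```

To run it over the real log, something like `link_outages(open("/var/log/kern.log"))` should work, provided the file is readable and has not been rotated since the events of interest.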
This is a production server.
How can I solve this problem? The mail server is in production, and I cannot tolerate network failures lasting 19 seconds.
- Check the errors field in the output of ifconfig. If it is non-zero, there are problems with the hardware (cable, NIC, or hub/switch). An unreliable Ethernet cable will produce errors in this field too.
- Run ethtool and make sure the network settings (duplex, etc.) match those on the switch. If you are not the admin of the switch, ask the network admin to provide you with the settings.
As a side note, you should assess whether you need flow control. According to HP, it is only necessary for high-performance applications: see the HP article "When to Use Flow Control".
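The error check can also be scripted so it can be repeated after each link drop. This is a rough sketch, assuming a Linux host where the kernel exposes per-interface counters under /sys/class/net/<iface>/statistics/ and assuming those counters reflect the same errors that ifconfig reports. The function names and the choice of counters are illustrative only.

```python
from pathlib import Path

# Counter files commonly present under /sys/class/net/<iface>/statistics/
COUNTERS = (
    "rx_errors",
    "tx_errors",
    "rx_dropped",
    "tx_dropped",
    "rx_crc_errors",
    "collisions",
)


def read_error_counters(iface="eth0", root="/sys/class/net"):
    """Read the error counters for one interface; skip files that are missing."""
    stats_dir = Path(root) / iface / "statistics"
    counters = {}
    for name in COUNTERS:
        path = stats_dir / name
        if path.exists():
            counters[name] = int(path.read_text().strip())
    return counters


def nonzero_counters(counters):
    """Keep only counters with a non-zero value, which hint at hardware trouble."""
    return {name: value for name, value in counters.items() if value != 0}
```

Calling `nonzero_counters(read_error_counters("eth0"))` before and after an outage, and comparing the two results, would show whether the errors are still increasing. The interface name is an assumption taken from the log above.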
Here's my fix. This problem happens on specific hardware (on one machine, only 1 out of 2 ports on the NIC), always with the e1000e driver, since kernel 3.9 or so. This file is for CentOS 7, goes in /etc/init.d/, and has to be enabled with chkconfig --add <name>. The interface name is hardcoded, so be sure to set it.