Check_MK has sent me an email as follows:
***** Nagios *****
Notification Type: PROBLEM
Service: Interface 5
Host: foo
Address: x.y.z.t
State: CRITICAL
Date/Time: Fri May 3 10:02:40 ICT 2013
Additional Info: CRIT - [tunl0] (up) speed unknown, in: 3.39MB/s, out: 0.00B/s, out-errors: 100.00%(!!) = 0.1
Running ifconfig, I got:
tunl0 Link encap:IPIP Tunnel HWaddr
inet addr:x.y.z.t Mask:255.255.255.255
UP RUNNING NOARP MTU:1480 Metric:1
RX packets:92101704629 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:652 dropped:0 overruns:0 carrier:0
collisions:652 txqueuelen:0
RX bytes:18941091817671 (17.2 TiB) TX bytes:0 (0.0 b)
Pay attention to the errors and collisions. I know that a nonzero value of the collisions field indicates the possibility of network congestion. But:
- What may be the exact cause? How can I troubleshoot?
- Is there anything similar to ethtool for an IPIP tunnel interface?
modinfo ipip
filename: /lib/modules/2.6.18-194.17.1.el5/kernel/net/ipv4/ipip.ko
license: GPL
srcversion: 288C625C7521D577F7AD9E4
depends: tunnel4
vermagic: 2.6.18-194.17.1.el5 SMP mod_unload gcc-4.1
module_sig: 883f3504ca37590565662cff69dd0be11277ff0a08d3a3...
ip tunnel show
tunl0: ip/ip remote any local any ttl inherit nopmtudisc
UPDATE at Mon May 6 10:05:01 ICT 2013
@Danila Ladner: Searching through Google, I found this link, which shares your opinion:
My tunnel does not work: ifconfig tunl<n> reports errors and collisions
Did you use ifconfig, perhaps ifconfig ... pointopoint ..., to set up your tunnel? Shut it down; delete it; start again with ip.
But could you please elaborate further?
@Sergey Vlasov:
tunl0 Link encap:IPIP Tunnel HWaddr
inet addr:x.y.z.t Mask:255.255.255.255
UP RUNNING NOARP MTU:1480 Metric:1
RX packets:81621711099 errors:0 dropped:0 overruns:0 frame:0
TX packets:2 errors:692 dropped:0 overruns:0 carrier:0
collisions:692 txqueuelen:0
RX bytes:16915649263419 (15.3 TiB) TX bytes:120 (120.0 b)
I don't understand why there are 2 transmitted packets on the tunl0 interface. I'm going to set up an event handler that runs tcpdump whenever the collisions counter increases. Let's wait and see what happens.
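A minimal sketch of such an event handler, assuming the counter is read from /proc/net/dev (where the transmit "colls" column is the 15th field once the interface name is split off). The function names, polling interval, packet count and output path are illustrative choices, not anything Check_MK or Nagios provides. Note that a capture started after the counter has already increased cannot contain the packet that triggered it; it only helps if the offending traffic recurs.

```shell
#!/bin/sh
# Print the TX collisions counter for interface $1.
# $2 is an optional stats file (defaults to /proc/net/dev) so the parser
# can also be run against saved output.
read_collisions() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }          # older kernels print "tunl0:123" with no space
        $1 == ifc { print $15 }    # 15th field = transmit "colls" column
    ' "${2:-/proc/net/dev}"
}

# Poll every $2 seconds (default 5); when the counter grows, capture
# $3 packets (default 200) on the interface into a timestamped pcap file.
watch_collisions() {
    ifc=$1 interval=${2:-5} count=${3:-200}
    last=$(read_collisions "$ifc")
    [ -n "$last" ] || return 1     # interface not found
    while sleep "$interval"; do
        now=$(read_collisions "$ifc")
        if [ "$now" -gt "$last" ]; then
            tcpdump -n -i "$ifc" -c "$count" -w "/tmp/$ifc-$(date +%s).pcap"
        fi
        last=$now
    done
}
```

Run it as root with something like: watch_collisions tunl0 5 200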
UPDATE at Tue May 7 14:05:39 ICT 2013
@Danila Ladner: To rule out that possibility, I have tried your suggestion:
ifdown tun0
modprobe -r ipip
modprobe ipip
ip addr add dev tunl0 x.y.z.t/32 brd x.y.z.t
ip link set tunl0 up
I'm waiting to see if the problem is resolved:
tunl0 Link encap:IPIP Tunnel HWaddr
inet addr:x.y.z.t Mask:255.255.255.255
UP RUNNING NOARP MTU:1480 Metric:1
RX packets:19630041 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:4083271398 (3.8 GiB) TX bytes:0 (0.0 b)
The collisions counter for an ipip tunnel interface is increased in two cases:
- If the next hop of the encapsulated packet is the same tunnel interface: ipip.c line 437.
- If the path MTU of the next hop for the encapsulated packet is less than 68: ipip.c line 447.
Both of these cases can usually happen only if the encapsulated traffic loops back into the same tunnel (the first case is a direct loop; the second happens when the path MTU is reduced down to zero by some more complicated loop that the first condition did not immediately detect). One possible cause is that the normal route for encapsulated packets was temporarily down, and the next best route for these packets happened to be the tunnel itself, resulting in a loop.
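One way to look for such a loop is to list every route that sends traffic out through the tunnel device. The helper below is only a sketch: routes_via_dev is a hypothetical name, and it simply filters text in the format printed by ip route show. It skips local and broadcast entries, because those are created automatically for the address configured on the interface and do not transmit through the tunnel.

```shell
#!/bin/sh
# Read "ip route show" style text on stdin and print every route whose
# output device is $1, ignoring the automatic local/broadcast entries.
routes_via_dev() {
    awk -v dev="$1" '
        $1 == "local" || $1 == "broadcast" { next }
        {
            for (i = 1; i < NF; i++)
                if ($i == "dev" && $(i + 1) == dev) { print; break }
        }
    '
}
```

For example: ip route show table all | routes_via_dev tunl0. Any line it prints is a candidate for the loop described above. Running ip route get against the remote address of the tunnel shows which device the kernel would pick for the encapsulated packets. In addition, ip -s tunnel show tunl0 prints per-tunnel TX statistics; in the iproute2 versions I am aware of, the collisions counter appears there under the DeadLoop column.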
However, in the LVS-TUN case nothing should have been sent to the tunnel at all (the tunnel interface in this case is receive-only), unless some misguided software added unneeded routes through tunl0.

As quanta noted, I suggested that he take the tunnel down if it was built with ifconfig and rebuild it with ip. I had a similar issue on CentOS 5 with kernel 2.6.25 a few years back, and in my case this resolved it. At the time I also asked network people and developers on IRC why it happened, because I needed that route on a production box and had to schedule downtime to tear it down. I do not remember the details and have no hard proof now, but Kuznetsov (an original major contributor to the kernel source in this area) suggested rebuilding it with ip, as he had seen issues with ifconfig. I hope this helps quanta resolve his issue.
OFF TOPIC: The bottom line is that I am quite dumb myself for still using ifconfig a lot, and it is hard to switch to ip as long as I keep dealing with old Solaris 8 and BSD boxes.