I'm using a Hurricane Electric tunnel on one of my VPSs, and it isn't working completely. I set up the tunnel with a script basically identical to this one: http://www.cybermilitia.net/2013/07/22/ipv6-tunnel-on-openvz/, with only modifications for my particular setup. I can ping the server and fetch a webpage from it, but I get the following output from ping6:
root@unixshell:~# ping6 -c4 2001:470:1f0e:12a7::2
PING 2001:470:1f0e:12a7::2(2001:470:1f0e:12a7::2) 56 data bytes
From 2002:d8da:e02a::1 icmp_seq=1 Destination unreachable: Address unreachable
64 bytes from 2001:470:1f0e:12a7::2: icmp_seq=1 ttl=63 time=96.4 ms
64 bytes from 2001:470:1f0e:12a7::2: icmp_seq=2 ttl=63 time=73.2 ms
From 2002:d8da:e02a::1 icmp_seq=2 Destination unreachable: Address unreachable
--- 2001:470:1f0e:12a7::2 ping statistics ---
2 packets transmitted, 2 received, +2 errors, 0% packet loss, time 1005ms
rtt min/avg/max/mdev = 73.256/84.838/96.420/11.582 ms
On the problem server, tcpdump shows the following at the same time as the ping above:
root@tektonic:~# tcpdump -n not port 22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on venet0, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
21:25:44.000024 IP 216.218.224.42 > 207.210.83.205: IP6 2002:cfd2:4a7c::1 > 2001:470:1f0e:12a7::2: ICMP6, echo request, seq 1, length 64
21:25:44.000094 IP 207.210.83.205 > 216.218.224.42: ICMP 207.210.83.205 protocol 41 port 0 unreachable, length 132
21:25:44.000629 IP 207.210.83.205 > 216.218.224.42: IP6 2001:470:1f0e:12a7::2 > 2002:cfd2:4a7c::1: ICMP6, echo reply, seq 1, length 64
21:25:45.020972 IP 216.218.224.42 > 207.210.83.205: IP6 2002:cfd2:4a7c::1 > 2001:470:1f0e:12a7::2: ICMP6, echo request, seq 2, length 64
21:25:45.021059 IP 207.210.83.205 > 216.218.224.42: ICMP 207.210.83.205 protocol 41 port 0 unreachable, length 132
21:25:45.021260 IP 207.210.83.205 > 216.218.224.42: IP6 2001:470:1f0e:12a7::2 > 2002:cfd2:4a7c::1: ICMP6, echo reply, seq 2, length 64
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel
And here's the relevant part of my iptables configuration:
root@tektonic:~# iptables --list | egrep '41|ipv6'
ACCEPT ipv6 -- anywhere anywhere
ACCEPT ipv6 -- anywhere anywhere
I realize I can just stop sending ICMP unreachable messages using iptables, as mentioned in "Disable ICMP Unreachable replies", but that is a suboptimal solution. Any ideas on how to troubleshoot the actual problem without having to dig through the kernel sources?
The "port 0" part of the error message is a red herring. A 6in4 packet is simply an IPv6 packet with an IPv4 header prepended, and thus has no port number at the IPv4 level. However, the ICMP packets being sent have type 3 and code 3, meaning "port unreachable", rather than code 2, "protocol unreachable". Here is one:
12:15:51.011697 IP 207.210.83.205 > 216.218.224.42: ICMP 207.210.83.205 protocol 41 port 0 unreachable, length 132
0x0000: 45c0 0098 8d6c 0000 4001 0f94 cfd2 53cd E....l..@.....S.
0x0010: d8da e02a 0303 62fb 0000 0000 4500 007c ...*..b.....E..|
0x0020: 9157 4000 f829 145c d8da e02a cfd2 53cd .W@..).\...*..S.
0x0030: 6000 0000 0040 3a3b 2002 cfd2 4a7c 0000 `....@:;....J|..
0x0040: 0000 0000 0000 0001 2001 0470 1f0e 12a7 ...........p....
0x0050: 0000 0000 0000 0002 8000 aa6e 1022 0002 ...........n."..
0x0060: ad8a c355 ce94 0a00 0809 0a0b 0c0d 0e0f ...U............
0x0070: 1011 1213 1415 1617 1819 1a1b 1c1d 1e1f ................
0x0080: 2021 2223 2425 2627 2829 2a2b 2c2d 2e2f .!"#$%&'()*+,-./
0x0090: 3031 3233 3435 3637 01234567
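For anyone who would rather not read the dump by eye, here is a minimal Python sketch that decodes the first 48 bytes of the packet above: the outer IPv4 header, the ICMP header, and the IPv4 header of the offending packet that the ICMP error quotes. The function names are my own and purely illustrative, and the sketch assumes no IPv4 options in the quoted header beyond what the IHL field reports.

```python
import socket
import struct

# First 48 bytes of the dump above: outer IPv4 header (20 bytes),
# ICMP header (8 bytes), and the quoted IPv4 header (20 bytes).
DUMP = (
    "45c0 0098 8d6c 0000 4001 0f94 cfd2 53cd "
    "d8da e02a 0303 62fb 0000 0000 4500 007c "
    "9157 4000 f829 145c d8da e02a cfd2 53cd"
)


def parse_ipv4_header(data):
    """Unpack the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, _tos, total_len, _ident, _frag,
     _ttl, proto, _csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "header_len": (ver_ihl & 0x0F) * 4,  # IHL is counted in 32-bit words
        "total_len": total_len,
        "proto": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


def decode_icmp_error(packet):
    """Return (outer header, ICMP type, ICMP code, quoted IPv4 header)."""
    outer = parse_ipv4_header(packet)
    icmp = packet[outer["header_len"]:]
    icmp_type, icmp_code = icmp[0], icmp[1]
    quoted = parse_ipv4_header(icmp[8:])  # ICMP error header is 8 bytes
    return outer, icmp_type, icmp_code, quoted


packet = bytes.fromhex(DUMP.replace(" ", ""))
outer, icmp_type, icmp_code, quoted = decode_icmp_error(packet)
print(outer["src"], "->", outer["dst"], "ICMP type", icmp_type, "code", icmp_code)
print("quoted packet:", quoted["src"], "->", quoted["dst"], "protocol", quoted["proto"])
```

The decoded values match the tcpdump summary line: an ICMP type 3, code 3 message from 207.210.83.205, quoting a protocol 41 packet that arrived from 216.218.224.42.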
[update 2015-08-06] Upgraded tb_userspace to revision 18, no change.
[update 2015-08-09] tb_userspace.c line 163 is:
sockv6 = socket(AF_INET, SOCK_RAW, IPPROTO_IPV6);
and lsof -c tb_userspace shows the socket is indeed created:
tb_usersp 6614 root 4u raw 0t0 1559059549 CD53D2CF:0029->00000000:0000 st=07
[update 2015-08-09 17:18 PDT] Confirmed that the same problem exists on a plain kernel without OpenVZ:
jcomeau@unixshell:~$ ping6 2001:470:66:79d::2
PING 2001:470:66:79d::2(2001:470:66:79d::2) 56 data bytes
64 bytes from 2001:470:66:79d::2: icmp_seq=1 ttl=60 time=86.5 ms
From 2001:470:0:206::2 icmp_seq=1 Destination unreachable: Address unreachable
64 bytes from 2001:470:66:79d::2: icmp_seq=2 ttl=60 time=83.4 ms
From 2001:470:0:206::2 icmp_seq=2 Destination unreachable: Address unreachable
64 bytes from 2001:470:66:79d::2: icmp_seq=3 ttl=60 time=86.1 ms
From 2001:470:0:206::2 icmp_seq=3 Destination unreachable: Address unreachable
^C
--- 2001:470:66:79d::2 ping statistics ---
3 packets transmitted, 3 received, +3 errors, 0% packet loss, time 2012ms
rtt min/avg/max/mdev = 83.429/85.376/86.556/1.427 ms
jcomeau@unixshell:~$ logout
Connection to www closed.
jcomeau@aspire:~$ uname -a
Linux aspire 3.2.0-4-amd64 #1 SMP Debian 3.2.54-2 x86_64 GNU/Linux
I also flushed iptables and ip6tables and removed all netfilter modules. Same symptom.
[update 2015-08-11 01:04] From http://linux.die.net/man/7/raw I found out that after a raw packet has been dispatched to any raw sockets, the kernel will still pass it to any modules registered for that protocol. The module on my own netbook, which is the plain (non-OpenVZ) kernel I was testing on, was tunnel4. Once I removed it, the destination unreachable messages stopped. I'm assuming that same module is built into the monolithic kernel on my VPS. /proc/kallsyms does not exist on it, so I will have to contact customer support.
[update 2015-08-11 01:50] http://www.haifux.org/lectures/217/netLec5.pdf was also a helpful resource.
As noted in the updates to the question, the problem is that after the kernel passes the packet to whatever raw sockets are listening on that protocol, it then hands it off to any kernel modules registered for that same protocol. Since I had been using a sit tunnel on my netbook, the tunnel4 module was still loaded even though I had temporarily set up the tb_userspace tunnel for testing. Because it was registered but had no handlers configured, it rejected the packets with the ICMP 3:3 message.
rmmod sit
followed by
rmmod tunnel4
solved that problem.
On the original problem server, it wasn't so easy, since it's an OpenVZ VPS with a monolithic kernel as seen by the client "boxes". But armed with the information from http://linux.die.net/man/7/raw and http://www.haifux.org/lectures/217/netLec5.pdf I was able to work with the provider to solve the problem. In this case, they re-installed the sit module so I didn't have to use the tb_userspace tunnel software at all. But I suspect the problem was that tunnel4 was installed there as well.
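Before reaching for rmmod, it can help to check whether sit or tunnel4 is loaded at all. Here is a small Python sketch that does so by parsing /proc/modules-style text. The helper names and the sample text are my own, made up for illustration. Note that /proc/modules lists only loadable modules, so this cannot detect handlers compiled into a monolithic kernel such as the one on the VPS.

```python
# Modules named in this answer as protocol 41 handlers on the netbook.
SUSPECT_MODULES = ("sit", "tunnel4")


def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first column is the module name
    return names


def suspect_modules_loaded(proc_modules_text, suspects=SUSPECT_MODULES):
    """Return the suspects that appear in the module list, sorted."""
    return sorted(loaded_modules(proc_modules_text) & set(suspects))


def read_proc_modules(path="/proc/modules"):
    """Read the module list, or return '' if the file is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


# Made-up sample in the /proc/modules column layout, for illustration only.
SAMPLE = """\
sit 17516 0 - Live 0x0000000000000000
tunnel4 12629 1 sit, Live 0x0000000000000000
ext4 350601 1 - Live 0x0000000000000000
"""

print("sample:", suspect_modules_loaded(SAMPLE))
print("this machine:", suspect_modules_loaded(read_proc_modules()))
```

An empty list for the running machine only means no loadable module by those names is present; it says nothing about built-in code.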
It looks like tektonic does not have the address 2001:470:1f0e:12a7::2 assigned to its venet0 interface. It is receiving the packets and rejecting them even though they are well-formed.
Your next step should be to verify that tektonic can establish TCP connections to IPv6-only hosts such as ipv6.google.com, and that the packets indeed travel to the configured Hurricane Electric relay host via IPv4 encapsulation. If TCP gets through but ICMP does not, then it is definitely an endpoint filtering problem (i.e. firewall rules).
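The TCP check suggested above can be sketched in Python as follows. The function name is hypothetical, and the sketch only reports whether a TCP handshake over IPv6 succeeds; it does not show which path the packets took, so tcpdump is still needed to confirm the IPv4 encapsulation.

```python
import socket


def can_connect_ipv6(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host succeeds over IPv6 only."""
    try:
        # Restrict resolution to AAAA records / IPv6 literals.
        candidates = socket.getaddrinfo(
            host, port, socket.AF_INET6, socket.SOCK_STREAM
        )
    except socket.gaierror:
        return False
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # try the next address, if any
    return False
```

For example, can_connect_ipv6("ipv6.google.com") run on tektonic, with tcpdump running alongside, would show whether TCP traffic makes it through the tunnel.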
The ICMP errors are sent by the kernel because no socket exists to receive protocol 41 packets with that particular combination of source and destination IP address.
If a process creates a raw socket with protocol 41, then the kernel will stop producing the ICMP errors. By default, such a socket will receive packets from every source IP sent to any destination IP assigned to the local machine. Using the bind and/or connect system calls, the application can restrict which combinations of source and destination IP address it will receive. Packets that do not match any such socket will still produce ICMP errors.
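As a rough illustration of the bind/connect restriction described above, here is a Python sketch of opening such a raw socket. The function name is my own, it is not how tb_userspace is written, and it needs root (CAP_NET_RAW) on Linux. Whether opening the socket actually silences the ICMP errors also depends on what else in the kernel is registered for protocol 41, as the updates to the question found.

```python
import socket

# IP protocol number for IPv6-in-IPv4 (6in4); same value as IPPROTO_IPV6.
PROTO_6IN4 = 41


def open_6in4_socket(local_ipv4=None, remote_ipv4=None):
    """Open a raw IPv4 socket for protocol 41.

    bind() limits delivery to packets addressed to local_ipv4;
    connect() limits delivery to packets sent from remote_ipv4.
    Raises PermissionError when run without CAP_NET_RAW.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, PROTO_6IN4)
    try:
        if local_ipv4 is not None:
            sock.bind((local_ipv4, 0))      # port is ignored for raw sockets
        if remote_ipv4 is not None:
            sock.connect((remote_ipv4, 0))  # no handshake; sets the peer filter
    except OSError:
        sock.close()
        raise
    return sock
```

With the addresses from the question, the call would be open_6in4_socket("207.210.83.205", "216.218.224.42"), so that only packets from the Hurricane Electric endpoint to the VPS address are delivered to the socket.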
It is clear that in your case the packets are both received by the tunnel and produce ICMP errors. According to my description above, it is impossible for a packet that is received by a socket to also produce the ICMP error message. However, there are other ways the packet could be received that do not stop the kernel from producing ICMP errors.
It is possible for a socket to receive packets at a lower protocol layer, where all packets are visible regardless of IP protocol number. If the tunnel software you are using relies on such a low-level socket to receive protocol 41 packets, then ICMP errors will be produced as described in your question.
If this is what the tunnel software does, then I would consider it to be a design flaw in the tunnel software. In that case you have three options: