I have an LXC container. I want to route all of its traffic through a different interface (tap0) than the host uses.
Host interfaces:
tap0: 172.30.0.3, gateway 172.30.0.1
lxcbr0: 192.168.12.104, with the container's veth as a member
In the container there is an eth0 with 192.168.12.105 and a default route via 192.168.12.104.
I can of course ping the host from the container and vice versa.
Container routing table is trivial:
# ip route show
default via 192.168.12.104 dev eth0
192.168.12.0/24 dev eth0 proto kernel scope link src 192.168.12.105
I made a separate routing table on the host:
# ip rule add from all fwmark 1234 table 1234
# ip route show table 1234
default via 172.30.0.1 dev tap0
Host main routing table (again, nothing special):
# ip route show
default via 192.168.xxx.xxx dev eth0 proto dhcp src 192.168.xxx.xxx metric 2004 mtu 1500
172.30.0.0/16 dev tap0 proto kernel scope link src 172.30.0.3
192.168.xxx.0/24 dev eth0 proto dhcp scope link src 192.168.xxx.xxx metric 2004 mtu 1500
192.168.12.0/24 dev lxcbr0 proto kernel scope link src 192.168.12.104
I configured iptables this way:
iptables -t nat -A POSTROUTING -o tap0 -j MASQUERADE
iptables -t nat -A PREROUTING -i lxcbr0 -j MARK --set-mark 1234
Now, when I ping 8.8.8.8 from the container, exactly every second ping is lost. Reliably.
# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=109 time=48.9 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=109 time=47.1 ms
64 bytes from 8.8.8.8: icmp_seq=5 ttl=109 time=47.0 ms
64 bytes from 8.8.8.8: icmp_seq=7 ttl=109 time=46.9 ms
64 bytes from 8.8.8.8: icmp_seq=9 ttl=109 time=47.1 ms
64 bytes from 8.8.8.8: icmp_seq=11 ttl=109 time=47.3 ms
64 bytes from 8.8.8.8: icmp_seq=13 ttl=109 time=47.1 ms
64 bytes from 8.8.8.8: icmp_seq=15 ttl=109 time=47.0 ms
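To make the loss pattern explicit rather than eyeballing it, the gaps in the sequence numbers can be listed with a small filter. This is only a sketch: the function name missing_seqs is made up for illustration, and it assumes the Linux iputils output format with "icmp_seq=N" shown above.

```shell
# Hypothetical helper: print the icmp_seq numbers missing from ping output.
# Reads ping output on stdin; assumes the "icmp_seq=N" format of iputils ping.
missing_seqs() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; the digits follow it.
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            seen[seq] = 1
            if (seq > max) max = seq
        }
        END {
            # Report every number between 1 and the highest reply seen
            # for which no reply line was found.
            for (i = 1; i <= max; i++)
                if (!(i in seen)) print i
        }
    '
}
```

Fed with the output above (for example via `ping -c 16 8.8.8.8 | missing_seqs`), it prints the even numbers 2 through 14, which confirms that the loss is strictly alternating and not random.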
Traffic on the bridge:
# tcpdump -i lxcbr0 -n
listening on lxcbr0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:04:46.208273 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 1, length 64
10:04:46.257177 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 1, length 64
10:04:47.209372 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 2, length 64
10:04:48.236402 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 3, length 64
10:04:48.283429 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 3, length 64
10:04:49.237599 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 4, length 64
10:04:50.252397 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 5, length 64
10:04:50.299356 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 5, length 64
10:04:51.253520 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 6, length 64
10:04:52.268435 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 7, length 64
10:04:52.315270 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 7, length 64
10:04:53.270429 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 8, length 64
10:04:54.284396 IP 192.168.12.105 > 8.8.8.8: ICMP echo request, id 138, seq 9, length 64
10:04:54.331473 IP 8.8.8.8 > 192.168.12.105: ICMP echo reply, id 138, seq 9, length 64
Traffic on tap0:
# tcpdump -i tap0 -n
listening on tap0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:04:46.208342 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 1, length 64
10:04:46.257147 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 1, length 64
10:04:48.236458 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 3, length 64
10:04:48.283402 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 3, length 64
10:04:50.252446 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 5, length 64
10:04:50.299328 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 5, length 64
10:04:52.268485 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 7, length 64
10:04:52.315242 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 7, length 64
10:04:54.284445 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 9, length 64
10:04:54.331445 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 9, length 64
10:04:56.300446 IP 172.30.0.3 > 8.8.8.8: ICMP echo request, id 138, seq 11, length 64
10:04:56.347598 IP 8.8.8.8 > 172.30.0.3: ICMP echo reply, id 138, seq 11, length 64
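The two captures can also be compared mechanically to see which echo requests entered on the bridge but never left on tap0. The following is a rough sketch, assuming both tcpdump outputs were saved as text files; the function name unforwarded_seqs and the file names are illustrative only.

```shell
# Hypothetical helper: list echo-request sequence numbers that appear in the
# first tcpdump text capture (bridge side) but not in the second (uplink side).
unforwarded_seqs() {
    awk '
        /ICMP echo request/ && match($0, /seq [0-9]+/) {
            # "seq " is 4 characters long; the digits follow it.
            seq = substr($0, RSTART + 4, RLENGTH - 4) + 0
            if (FILENAME == ARGV[1]) inner[seq] = 1
            else                     outer[seq] = 1
        }
        END {
            for (s in inner)
                if (!(s in outer)) print s
        }
    ' "$1" "$2" | sort -n
}
```

Called as `unforwarded_seqs lxcbr0.txt tap0.txt` on the two captures shown here, it prints 2, 4, 6 and 8. Both captures should cover the same time window, otherwise requests at the end of the longer one show up as false positives.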
Whenever there is an outgoing ping on tap0, there is always a response (so everything behind tap0 works okay).
It looks like the host is dropping outgoing traffic from the container. How can I debug this situation?
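A possible starting point for debugging is to look at what the kernel decides for a marked versus an unmarked packet, and at the per-rule packet counters. The sketch below only wraps standard ip, iptables and conntrack invocations in a function; the function name and its arguments are made up for illustration, it has to run as root, and the conntrack command is only available if the conntrack-tools package is installed.

```shell
# Hypothetical helper. Run as root.
# Usage: diagnose_mark_routing <fwmark> <destination>
diagnose_mark_routing() {
    mark=$1
    dst=$2

    # Policy rules, and the route chosen with and without the mark.
    ip rule show
    ip route get "$dst" mark "$mark"
    ip route get "$dst"

    # Packet counters per rule: compare the count on the MARK rule with
    # the number of echo requests the container actually sent.
    iptables -t nat -L PREROUTING -v -n
    iptables -t mangle -L PREROUTING -v -n

    # Connection tracking entries for ICMP, if conntrack-tools is present.
    if command -v conntrack >/dev/null 2>&1; then
        conntrack -L -p icmp
    fi
}
```

For the setup described here the call would be `diagnose_mark_routing 1234 8.8.8.8`. If the counter on the MARK rule grows more slowly than the number of pings sent, the rule is not seeing every packet. One thing worth keeping in mind when reading the counters is that the nat table is normally consulted only for the first packet of a tracked connection.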
I solved the issue by using the mangle table instead of the nat table. The magic configuration is: