I am trying to set up PXE booting (which requires TFTP) on one of my networks that sits behind a NAT router.
My question is similar to many others around the 'Net, but all the answers I found applied to CentOS 7 with iptables. I need to do this with CentOS 8 with firewalld and nft as the backend.
Related questions: "Unable to NAT TFTP traffic because iptables is not forwarding the return connection to the client despite TFTP helper creating an expectation" and https://unix.stackexchange.com/questions/579508/iptables-rules-to-forward-tftp-via-nat
Here is my simplified network diagram:
Outside NAT                                       Inside NAT
10.0.10.10            10.0.10.11->192.168.1.1     192.168.1.2
TFTP server --------> NAT             --------->  PXE/TFTP client
TFTP is not working. With tcpdump, I see that the RRQ message travels successfully from 192.168.1.2 to 10.0.10.10. The response arrives at the router, but is not properly NATed to arrive at the client.
I tried it with both settings of the sysctl net.netfilter.nf_conntrack_helper (rebooting after each change):
# sysctl -a | grep conntrack_helper
net.netfilter.nf_conntrack_helper = 0
With nf_conntrack_helper=0:
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
Initial RRQ:
14:02:27.842563 IP (tos 0x0, ttl 64, id 64642, offset 0, flags [DF], proto UDP (17), length 54)
192.168.1.2.36799 > 10.0.10.10.69: [udp sum ok] 26 RRQ "grub2/grubx64.efi" octet
Initial RRQ after NAT:
14:02:27.842619 IP (tos 0x0, ttl 63, id 64642, offset 0, flags [DF], proto UDP (17), length 54)
10.0.10.11.36799 > 10.0.10.10.69: [udp sum ok] 26 RRQ "grub2/grubx64.efi" octet
Response from TFTP server to NAT router:
14:02:27.857924 IP (tos 0x0, ttl 63, id 60000, offset 0, flags [none], proto UDP (17), length 544)
10.0.10.10.60702 > 10.0.10.11.36799: [udp sum ok] UDP, length 516
(repeated several times until timeout)
With nf_conntrack_helper=1, the outgoing packet is not NATed at all:
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
Initial RRQ:
14:02:27.842563 IP (tos 0x0, ttl 64, id 64642, offset 0, flags [DF], proto UDP (17), length 54)
192.168.1.2.36799 > 10.0.10.10.69: [udp sum ok] 26 RRQ "grub2/grubx64.efi" octet
(repeated several times until timeout)
The nf_*_tftp helper modules are both loaded (regardless of the nf_conntrack_helper setting):
# lsmod | grep tftp
nf_nat_tftp 16384 0
nf_conntrack_tftp 16384 3 nf_nat_tftp
nf_nat 36864 3 nf_nat_ipv6,nf_nat_ipv4,nf_nat_tftp
nf_conntrack 155648 10 nf_conntrack_ipv6,nf_conntrack_ipv4,nf_nat,nf_conntrack_tftp,nft_ct,nf_nat_ipv6,nf_nat_ipv4,nf_nat_tftp,nft_masq,nft_masq_ipv4
One of the articles linked above suggests the following iptables rule (which makes sense):
iptables -A PREROUTING -t raw -p udp --dport 69 -s 192.168.11.0/24 -d 172.16.0.0/16 -j CT --helper tftp
How would I do the equivalent with firewalld using the nft backend?
Update:
The firewalld configuration is fairly complex, so I'm only adding the relevant zones:
The external zone:
<?xml version="1.0" encoding="utf-8"?>
<zone>
<source address="10.0.10.0/24"/>
<service name="tftp-client"/>
<service name="ssh"/>
<masquerade/>
</zone>
And the internal zone:
<?xml version="1.0" encoding="utf-8"?>
<zone>
<source address="192.168.1.0/24"/>
<service name="dhcp"/>
<service name="ssh"/>
<service name="dns"/>
<service name="tftp"/>
<masquerade/>
</zone>
Note: the masquerade on the internal zone was a mistake. I removed it, but the behavior did not change.
Zone drifting is disabled.
Update 2:
To answer a request from a commenter:
DHCP configuration
The DHCP server is running on the same system as the NAT router (192.168.1.1 in the network diagram). It is standard ISC DHCP, handing out IP addresses (as fixed-address; there is no pool involved), mask, gateway, DNS server, etc., as well as the PXE Boot next-server and filename options.
All this obviously works. tcpdump shows that the client sends the correct RRQ packet to the server.
The response arrives back at the NAT router, but does not get sent to the behind-the-NAT side.
Details about how TFTP works and how it breaks with NAT
If you understand the TFTP protocol, it is fairly clear what is happening; I just do not know how to handle it with firewalld/nft/CentOS 8.
Fundamentally, the problem is that the TFTP protocol uses UDP ports in a non-standard way. In "standard" UDP-based protocols such as DNS, the response comes from the same port that the server listens on.
Request: client:54321 -> server:53
Response: server:53 -> client:54321
(where 54321 can be any random ephemeral port number picked by the client).
NAT matches up those IP addresses and ports to identify which response belongs to which request.
TFTP does it differently; the responses do not come from port 69, but some other random port.
Request (RRQ): client:54321 -> server:69
Response (Data): server:12345 -> client:54321
Where 54321 is again a random ephemeral port the client chooses, and 12345 is a random ephemeral port the server chooses.
As a result, standard NAT finds no tracked connection whose reply would originate from server:12345, so it drops the packet.
The solution to this problem involves using a helper - the nf_nat_tftp kernel module that understands this quirk.
I just have not been able to figure out how to implement this using CentOS 8, nftables and firewalld.
An answer that uses nftables is perfectly acceptable for me, as long as it does not break any firewalld rules.
Reason it's not working
It appears firewalld might be geared to handle firewalling local services, rather than routed services.
With the zone files from the question, the tftp service settings end up adding these nft rules once firewalld is configured on CentOS 8 (only the relevant rules are shown, not the whole ruleset):
Those rules will never match forwarded traffic and are thus useless here: they are in the input path, not in the forward path.
With the firewall running, the same rules (blindly copied) added in the right place, the forward path, make TFTP work:
So in the end a so-called direct rule would still be an option, so that everything stays stored in firewalld's configuration. Alas, the documentation is a bit misleading:
Without reading carefully, one would think that with FirewallBackend=nftables it would behave differently and accept nftables rules, but that's not the case:
No need to test much more; this "feature" is documented here:
https://bugzilla.redhat.com/show_bug.cgi?id=1692964
and there:
https://github.com/firewalld/firewalld/issues/555
Direct rules still use iptables even with the nftables backend. The caveat in the documentation is about the order in which the rules are evaluated.
Handle this in another table
I no longer see the point of doing this with firewall-cmd, which would add iptables rules alongside the nftables rules. It is cleaner to add an independent table. It will be in the ip family, since it also filters on the specific IPv4 networks (inet would also be fine).
handletftp.nft (to be loaded with nft -f handletftp.nft):
Because the table is separate and the ruleset is never flushed (instead, this specific table is atomically deleted and recreated), it does not affect firewalld, nor does firewalld affect it.
The priority doesn't matter much: whether this chain is traversed before or after firewalld's chains won't change the fate of the packet (which remains in firewalld's hands). Whatever the order, if the packet is accepted by firewalld, it will also have activated the helper for this flow.
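A minimal sketch of a standalone table of the kind described above, written as a shell heredoc that creates handletftp.nft. It is an illustration under assumptions, not a verified copy of any particular setup: the addresses (192.168.1.0/24 for the LAN, 10.0.10.10 for the TFTP server) come from the question, while the names handletftp, helper-tftp and sethelper, and the priority 0, are arbitrary choices.

```shell
# Write the standalone nftables table to a file in the current directory.
# Assumptions: LAN 192.168.1.0/24 and TFTP server 10.0.10.10 as in the question;
# the table, helper and chain names are illustrative only.
cat > handletftp.nft <<'EOF'
# Declare the table first so the delete below never fails, then recreate it.
# "nft -f" applies the whole file as a single transaction.
table ip handletftp
delete table ip handletftp

table ip handletftp {
    # Helper object bound to the kernel's tftp conntrack helper.
    ct helper helper-tftp {
        type "tftp" protocol udp
    }

    chain sethelper {
        type filter hook forward priority 0; policy accept;
        # Attach the helper to forwarded TFTP requests coming from the LAN.
        ip saddr 192.168.1.0/24 ip daddr 10.0.10.10 udp dport 69 ct helper set "helper-tftp"
    }
}
EOF
```

The chain only assigns the helper and its policy is accept, so it never decides on its own whether a packet is dropped; that stays with firewalld. Adjust the source and destination filters if your networks differ.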
If you choose to use the nftables service to load this table, you'll have to edit it (e.g. systemctl edit --full nftables), because besides loading some probably inadequate default rules, it flushes all rules on stop or reload, disrupting firewalld.
Now a TFTP transfer will work and activate the specific helper, as can be checked by running two conntrack commands during the transfer:
The 3rd NEW entry in the example above is actually tagged as RELATED (that is the whole role of the tftp helper: expecting a certain type of packet so that it is seen as related), which will be accepted by the firewall.