I have two dedicated servers: "web" (YYY.YYY.YYY.YYY) and "monitor" (XXX.XXX.XXX.XXX). Both are in the same network of a mass hoster (Hetzner).
On "web" I have three Prometheus metrics endpoints running: docker-engine (9323) on the bare-metal host, and neo4j (2004) and telegraf (9273) as Docker containers. Both containers publish their ports to the host correctly, so the following calls executed on "web" work:
lynx http://YYY.YYY.YYY.YYY:9323/metrics => OK
lynx http://YYY.YYY.YYY.YYY:9273/metrics => OK
lynx http://YYY.YYY.YYY.YYY:2004/metrics => OK
But calling those endpoints from the "monitor" server works only for the bare-metal service, docker-engine (9323):
lynx http://YYY.YYY.YYY.YYY:9323/metrics => OK
lynx http://YYY.YYY.YYY.YYY:9273/metrics => timeout
lynx http://YYY.YYY.YYY.YYY:2004/metrics => timeout
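To make the check above repeatable, it could be scripted. This is only a minimal sketch, not something I have deployed: the function name probe and the 3-second default timeout are my own assumptions. It separates a timeout (typically a packet silently dropped somewhere on the path) from a refused connection (nothing listening, or an active reject), which lynx does not make obvious.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connect and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    The name and the default timeout are illustrative assumptions.
    """
    try:
        # create_connection performs the full TCP handshake, then we close.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The target (or a firewall using REJECT) answered with RST.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: consistent with a DROP rule.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. no route to host or name resolution failure.
        return f"error: {exc}"
```

Run from "monitor", something like probe("YYY.YYY.YYY.YYY", 9273) for each of the three ports would presumably report "open" for 9323 and "timeout" for the other two, matching the lynx results above.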
Running ufw status verbose on "web" reports the following:
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing), deny (routed)
New profiles: skip
To                         Action      From
--                         ------      ----
[...]
9323/tcp                   ALLOW IN    XXX.XXX.XXX.XXX
9273/tcp                   ALLOW IN    XXX.XXX.XXX.XXX
2004/tcp                   ALLOW IN    XXX.XXX.XXX.XXX
[...]
There are no other rules with those IPs and no general rules that apply to subnets, interfaces etc. All other rules are for discrete ports, like 22, 80, 443 etc.
The strange thing is that it worked just a few hours ago. In the meantime I was experimenting a little with the approach described at https://medium.com/@pitapun_44686/what-is-the-best-practice-of-docker-ufw-under-ubuntu-69e11c826b31 and appended the following block to the very end of /etc/ufw/after.rules:
*filter
:ufw-user-forward - [0:0]
:DOCKER-USER - [0:0]
-A DOCKER-USER -j RETURN -s 10.0.0.0/8
-A DOCKER-USER -j RETURN -s 172.16.0.0/12
-A DOCKER-USER -j RETURN -s 192.168.0.0/16
-A DOCKER-USER -j ufw-user-forward
-A DOCKER-USER -j DROP -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 192.168.0.0/16
-A DOCKER-USER -j DROP -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 10.0.0.0/8
-A DOCKER-USER -j DROP -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 172.16.0.0/12
-A DOCKER-USER -j DROP -p udp -m udp --dport 0:32767 -d 192.168.0.0/16
-A DOCKER-USER -j DROP -p udp -m udp --dport 0:32767 -d 10.0.0.0/8
-A DOCKER-USER -j DROP -p udp -m udp --dport 0:32767 -d 172.16.0.0/12
-A DOCKER-USER -j RETURN
COMMIT
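My reading of this block, written out as a small sketch so that it can be checked. It walks the DOCKER-USER rules in order for a single packet. Two things here are assumptions of mine and not taken from my setup: that ufw-user-forward contains no matching rule (so the packet falls through it), and the example addresses 203.0.113.10 (a documentation address standing in for a public IP) and 172.17.0.2 (a typical default-bridge container address).

```python
import ipaddress

# The three private ranges that appear in the DOCKER-USER rules.
PRIVATE_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def in_private(addr: str) -> bool:
    """True if addr lies in one of the private ranges used by the rules."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in PRIVATE_NETS)


def docker_user_verdict(src: str, dst: str, proto: str, dport: int,
                        new_tcp_syn: bool = False) -> str:
    """Return "RETURN" or "DROP" for one packet traversing DOCKER-USER.

    ASSUMPTION: ufw-user-forward has no rule matching this packet,
    so the jump to it falls through to the DROP rules.
    """
    # -A DOCKER-USER -j RETURN -s <private range>
    if in_private(src):
        return "RETURN"
    # -A DOCKER-USER -j ufw-user-forward  (assumed not to match)
    # -A DOCKER-USER -j DROP -p tcp --tcp-flags FIN,SYN,RST,ACK SYN -d <private>
    if proto == "tcp" and new_tcp_syn and in_private(dst):
        return "DROP"
    # -A DOCKER-USER -j DROP -p udp --dport 0:32767 -d <private>
    if proto == "udp" and 0 <= dport <= 32767 and in_private(dst):
        return "DROP"
    # -A DOCKER-USER -j RETURN
    return "RETURN"


# A new TCP connection from a public address to a container address:
print(docker_user_verdict("203.0.113.10", "172.17.0.2", "tcp", 9273,
                          new_tcp_syn=True))
```

If this reading is right, a new connection from a public source address to a container is dropped while the block is active unless ufw-user-forward explicitly accepts it, whereas traffic to a service on the host itself never passes through DOCKER-USER at all.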
I have since commented the after.rules block out and restarted ufw. Ports 9273 and 2004 are still not accessible, so this was not the cause.
I set the ufw log level to high, but I cannot see any connection attempts or dropped packets from host XXX.XXX.XXX.XXX.
When I telnet to the one working port (telnet YYY.YYY.YYY.YYY 9323), I can see the traffic in the ufw logs, but nothing shows up for the other two ports.
[UFW AUDIT] SRC=XXX.XXX.XXX.XXX DST=YYY.YYY.YYY.YYY DPT=9323 =>
[UFW AUDIT] SRC=YYY.YYY.YYY.YYY DST=XXX.XXX.XXX.XXX SPT=9323
I provisioned ufw using the Ansible "ufw" module.
What other reasons could there be? What is going on? :-)
Could it be that the hoster's network imposes some kind of filter between those servers because of "suspicious" activity (frequent communication)? I also happened to run some heavy artillery.io tests from XXX.XXX.XXX.XXX against ports 80/443 on YYY.YYY.YYY.YYY today, but this hypothesis does not explain why only those two ports stopped working.
And the ultimate test, shutting down ufw on YYY.YYY.YYY.YYY, does not help either. Ports 9273 and 2004 are not accessible, 9323 is.
Here is the output from iptables -L -v -n:
https://pastebin.com/HVeJGXb9