I have a VPS running multiple docker containers. I have a Nextcloud instance, which gets SSL/TLS termination from an nginx proxy (certificates from Let's Encrypt). And I have an OpenVPN container. In its docker network I also host further services (my own bind DNS server and a git server), which I can reach through the VPN.
Now I also want to reach my Nextcloud instance through the VPN. Originally I thought this wouldn't be a problem, since the Nextcloud instance can be reached from the internet and the VPN also provides internet access. But unfortunately I cannot reach it: if I curl my server (http or https) through the VPN, I get "port 80/443: No route to host". Without the VPN connected, the connection works correctly.
If I use traceroute, I can see that it correctly reaches the public IP of my VPS. So I conclude that it is a routing problem: traffic targeted at ports 80/443 of the public IP of my VPS does not get forwarded/routed to the nginx proxy container (which exposes those ports).
As I understand it, docker uses firewalld/iptables to route traffic between and to containers. Thus different rules are applied to the VPN traffic than to traffic coming from the internet. What do I need to configure, and how, so that the (server-internal) VPN traffic to my public IP address is forwarded correctly to the corresponding container? I would like connectivity to stay the same between the VPN and no-VPN states, so that my Nextcloud app does not get confused.
What I have tried: I looked into possible workarounds. I could add my own DNS entry for my Nextcloud instance in my VPN DNS server, pointing either to the IP of the Nextcloud app container (where I would lose SSL/TLS termination) or to that of the nginx proxy. In the latter case the nginx proxy does not forward the traffic to the Nextcloud container, since a different hostname is used. I want to leave the proxy configuration unchanged if possible, since it is filled automatically at container startup by the letsencrypt companion container. Also, the certificates would not match the FQDN used. If I instead add a master zone with my real/public DNS name (so that I can use the same FQDN as from the outside), all other domains from that TLD are no longer forwarded (is there a way to configure bind for that?).
TL;DR: Traffic from a docker container to the public VPS IP address does not get forwarded to the correct docker container the way traffic from outside does.
If you need more information about the containers used, I will add links and my docker-compose files.
EDIT:
[root@XXXXXXXX ~]# iptables -S FORWARD
-P FORWARD ACCEPT
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -o br-7e5cecc96f4a -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-7e5cecc96f4a -j DOCKER
-A FORWARD -i br-7e5cecc96f4a ! -o br-7e5cecc96f4a -j ACCEPT
-A FORWARD -i br-7e5cecc96f4a -o br-7e5cecc96f4a -j ACCEPT
-A FORWARD -o br-fd56ce52983e -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-fd56ce52983e -j DOCKER
-A FORWARD -i br-fd56ce52983e ! -o br-fd56ce52983e -j ACCEPT
-A FORWARD -i br-fd56ce52983e -o br-fd56ce52983e -j ACCEPT
-A FORWARD -o br-f1ef60d84b48 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-f1ef60d84b48 -j DOCKER
-A FORWARD -i br-f1ef60d84b48 ! -o br-f1ef60d84b48 -j ACCEPT
-A FORWARD -i br-f1ef60d84b48 -o br-f1ef60d84b48 -j ACCEPT
-A FORWARD -o br-b396aa5a2d35 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-b396aa5a2d35 -j DOCKER
-A FORWARD -i br-b396aa5a2d35 ! -o br-b396aa5a2d35 -j ACCEPT
-A FORWARD -i br-b396aa5a2d35 -o br-b396aa5a2d35 -j ACCEPT
-A FORWARD -o br-83ac9a15401e -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o br-83ac9a15401e -j DOCKER
-A FORWARD -i br-83ac9a15401e ! -o br-83ac9a15401e -j ACCEPT
-A FORWARD -i br-83ac9a15401e -o br-83ac9a15401e -j ACCEPT
-A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i lo -j ACCEPT
-A FORWARD -j FORWARD_direct
-A FORWARD -j FORWARD_IN_ZONES_SOURCE
-A FORWARD -j FORWARD_IN_ZONES
-A FORWARD -j FORWARD_OUT_ZONES_SOURCE
-A FORWARD -j FORWARD_OUT_ZONES
-A FORWARD -m conntrack --ctstate INVALID -j DROP
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
Docker by default does not allow traffic between two of its containers that are connected to different bridges. It also does not allow traffic from a container to a port that has been mapped to the outside by Docker itself. All of this is implemented with iptables.
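If you want to see how Docker has set this up on your host, you can list the chains involved (chain names as created by a stock Docker install; on your machine firewalld adds its own chains around them):

iptables -t nat -S DOCKER
iptables -t filter -S DOCKER-USER
iptables -t filter -S DOCKER-ISOLATION-STAGE-1
iptables -t filter -S DOCKER-ISOLATION-STAGE-2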
First off, the mapping of a port to the outside also happens with iptables. It uses a DNAT rule in the nat table. For these rules Docker creates a separate DOCKER chain, so that the same rules apply from both PREROUTING and OUTPUT in the nat table. The DNAT rules are preceded by RETURN jumps that filter out all traffic coming from a Docker bridge. So that is the first hurdle. It looks a bit like this:
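(Reconstructed illustration; br-one, br-two, EXTERNALPORT and INTERNALIP:INTERNALPORT are placeholders, compare with the output of iptables -t nat -S DOCKER on your host.)

-A DOCKER -i br-one -j RETURN
-A DOCKER -i br-two -j RETURN
-A DOCKER ! -i br-two -p tcp -m tcp --dport EXTERNALPORT -j DNAT --to-destination INTERNALIP:INTERNALPORT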
The DNAT rule can also have a -d address if you exposed the port on that local address only. No traffic from any Docker bridge can hit the DNAT rule(s), because of the RETURN rules before them. On top of that, the DNAT rule does not allow a DNAT back through the same bridge the traffic came from, which wouldn't be necessary anyway, because from the same bridge you can already reach the INTERNALPORT directly.

The restriction on traffic between containers on different bridges is implemented in the filter table of iptables. Two custom chains sit at the beginning of the FORWARD chain, whose default policy Docker normally sets to DROP. One is for user-defined rules (DOCKER-USER), the other implements the isolation between Docker bridges: DOCKER-ISOLATION-STAGE-1. That chain in turn uses DOCKER-ISOLATION-STAGE-2. The combination of the two basically says: if traffic leaves one Docker bridge and then enters another Docker bridge, DROP it (without ICMP signaling, so the connection just hangs...). It looks like this:
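(Again a reconstructed illustration with placeholder bridge names; compare with iptables -t filter -S DOCKER-ISOLATION-STAGE-1 and -S DOCKER-ISOLATION-STAGE-2 on your host.)

-A DOCKER-ISOLATION-STAGE-1 -i br-one ! -o br-one -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -i br-two ! -o br-two -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o br-one -j DROP
-A DOCKER-ISOLATION-STAGE-2 -o br-two -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN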
So, if you want traffic from bridge one to hit a DNAT for a port exposed on the outside by a container on bridge two, and you want the traffic to return for a full connection, then you have to do a couple of things:

Drop the RETURN rules that stop the traffic from reaching the DNAT rules in the DOCKER chain of the nat table. You HAVE to remove the RETURN for the source bridge. You CAN leave the RETURN for the destination bridge if you don't want to allow a container on that bridge to access a DNAT exposed port.

iptables -t nat -D DOCKER -i br-one -j RETURN
iptables -t nat -D DOCKER -i br-two -j RETURN   # Optional if br-one -> br-two

Remove the DROP rules for both bridges from the DOCKER-ISOLATION-STAGE-2 chain in the filter table.

iptables -t filter -D DOCKER-ISOLATION-STAGE-2 -o br-one -j DROP
iptables -t filter -D DOCKER-ISOLATION-STAGE-2 -o br-two -j DROP
Now the lines are open.
Docker does not refresh its rules very often (at least not in the 19.03 version I tested with). It seems it only rebuilds the rule sets when the docker daemon restarts, not when you stop, start or create a container. You could hook any changes into the service restart to keep them persistent.
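One way to do that, sketched here under the assumption of a systemd-based host (the script path, the drop-in file name and the bridge names are placeholders to adapt; the deletions are the same ones shown above):

cat > /usr/local/sbin/docker-open-bridges.sh <<'EOF'
#!/bin/sh
# Re-apply the rule deletions after the docker daemon has rebuilt its
# iptables rules; "|| true" keeps the script from failing when a rule
# has already been removed.
iptables -t nat -D DOCKER -i br-one -j RETURN || true
iptables -t filter -D DOCKER-ISOLATION-STAGE-2 -o br-one -j DROP || true
iptables -t filter -D DOCKER-ISOLATION-STAGE-2 -o br-two -j DROP || true
EOF
chmod +x /usr/local/sbin/docker-open-bridges.sh

mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/open-bridges.conf <<'EOF'
[Service]
# Run the script every time docker.service is (re)started.
ExecStartPost=/usr/local/sbin/docker-open-bridges.sh
EOF
systemctl daemon-reload

If the bridges are only created later (for example by docker-compose bringing a network up), you may have to run the script again at that point.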