I have several VPCs set up in AWS, and all of my instances use provisioned IP addresses (that is, no Elastic IP addresses).
When an instance boots, it runs a script (after networking is up) that gets the instance ID, the zone ID (from local config), the region, and so on. Once it has this information, it calls Route 53 to update the DNS records for that instance in a private hosted zone.
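For anyone wanting to reproduce the setup, a boot script along these lines might look roughly like the sketch below. It is not the actual script: the function names are made up for illustration, the 60-second TTL is an assumed value, and it assumes boto3 is installed and the instance role is allowed to call `route53:ChangeResourceRecordSets`. The private IP is read from the instance metadata service (IMDSv2).

```python
import urllib.request

IMDS = "http://169.254.169.254/latest"


def imds_get(path, timeout=2):
    """Read one value from the instance metadata service using IMDSv2."""
    # IMDSv2 requires a short-lived session token obtained with a PUT.
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    req = urllib.request.Request(
        f"{IMDS}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode()


def build_change_batch(record_name, private_ip, ttl=60):
    """Build the ChangeBatch for an UPSERT of a single A record.

    The default TTL of 60 seconds is an assumption, not a value from the post.
    """
    return {
        "Comment": "boot-time update of private A record",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": private_ip}],
                },
            }
        ],
    }


def update_record(hosted_zone_id, record_name, ttl=60):
    """Look up this instance's private IP and upsert it into the hosted zone."""
    import boto3  # imported here so the helpers above work without boto3

    private_ip = imds_get("local-ipv4")
    route53 = boto3.client("route53")
    return route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch=build_change_batch(record_name, private_ip, ttl),
    )
```

On the instance this would be called once at boot with the private hosted zone's ID and the record name, for example `update_record(zone_id, "staticdns.mydomain.private")`. Using `UPSERT` means the same call works whether or not the record already exists.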
The reason for this is basically so that I can use DNS names in server connection strings. My web and DB servers are in a private subnet, and when the web server connects to the DB, it just uses staticdns.mydomain.private, which maps to the instance's private IP address. This way, no reconfiguration is needed when the instance is rebooted or its IP changes for other reasons.
This is all well and good, and it works, with one caveat. There is a delay before the new DNS mappings resolve. I am not sure how long it is; it isn't very long, but it seems somewhat random (TTL, maybe?). While the resolver still has the old IP cached, connections from the web server to the database fail. I would much prefer that the cache be released when the record is updated, but I have no clue where to even look for that.
Does anyone know if there is a way to refresh the DNS resolver cache for private zones in Route 53? I have also tried using nscd on the server, which did not seem to help.
A couple of options and notes:
If the servers are in the same VPC or in peered VPCs, use their private IPs for communication, not the public ones. Private IPs stay the same when an instance is stopped or restarted.
The old records are cached on the hosts, not in Route 53. You'd have to flush the nscd cache on all the other hosts every time one IP changes, which is quite a lot of automation. Besides, some apps and frameworks also cache the records outside of nscd, so it's quite hard to flush everything when needed.
You can lower the TTL of your DNS records to 60 (1 minute), which means resolved records won't be cached for more than a minute. That's the same approach that AWS RDS uses for its failover mechanism.
Use a Network Load Balancer (NLB). It will provide a stable IP for your server even if the server's actual IP changes. However, it's overkill for this.
Use Elastic IPs. That would solve your problem too. Note that AWS now charges for every public IPv4 address, including Elastic IPs attached to a running instance.
Use AWS RDS, possibly Aurora Serverless, which costs next to nothing when not in use. All the management, failover, etc. will be done for you.
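On the point about caching on the hosts and a short TTL: whichever option you pick, it may help if the client side looks the name up again on every connection attempt instead of holding on to an address. A minimal sketch of that idea follows. The function name, retry count, and delays are made up for illustration, and it only works at the plain TCP level; a real database driver or connection pool would need its own reconnect handling.

```python
import socket
import time


def connect_with_reresolve(host, port, attempts=5, delay=2.0, timeout=3.0):
    """Open a TCP connection, resolving `host` again on every attempt."""
    last_error = None
    for attempt in range(attempts):
        try:
            # getaddrinfo runs on every attempt, so this code never keeps an
            # address itself. OS-level caches such as nscd still apply, which
            # is why a short record TTL matters.
            infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
            family, socktype, proto, _, sockaddr = infos[0]
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            try:
                sock.connect(sockaddr)
            except OSError:
                sock.close()
                raise
            return sock
        except OSError as exc:  # includes socket.gaierror and refused connects
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay)
    raise ConnectionError(f"could not connect to {host}:{port}") from last_error
```

With a 60-second TTL and, say, five attempts spaced 15 seconds apart, the retries would span the window in which the old address can still be served from a cache.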