As far as I can tell, this problem is isolated to PowerDNS. The servers are running two packages, pdns-static-3.0.1-1.i386.rpm
and pdns-recursor-3.3-1.i386.rpm,
on the most recent version of Amazon Linux.
The Amazon EC2 load balancers are assigned a CNAME that resolves to multiple hosts. Below is an example of the actual behavior. Notice how the hosts are always in the same order.
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
The expected behavior is round robin across the hosts:
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
[root@localhost ~]# host cache.domain.com
cache.domain.com is an alias for xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com has address aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com has address bbb.bbb.bbb.bbb
The addresses do eventually swap, but this seems to happen on a 30-minute cache timer; changing the TTL of the record doesn't appear to affect anything. It appears as though the resolver keeps a cache of the response. This adversely affects my application because all of the load is sent to only one of the load balancers (Availability Zones), so if I have servers in two zones, only one zone is under load at a time.
Do you know how I can fix this so that the order of the addresses alternates each time the host is resolved?
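One way to quantify the symptom is to count how often each address comes back first over repeated lookups. The following is a minimal Python sketch, not part of the original setup: the function names `resolve_ipv4` and `first_address_counts` are made up for illustration, and it goes through the system resolver via `socket.getaddrinfo`, which may apply its own address sorting, so `host` or `dig` against the recursor remains the more direct check.

```python
import socket
from collections import Counter


def resolve_ipv4(hostname):
    """Return the IPv4 addresses for hostname in the order getaddrinfo gives them."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def first_address_counts(hostname, attempts=20, resolve=resolve_ipv4):
    """Count how often each address appears first across repeated lookups.

    `resolve` is injectable so the counting logic can be exercised
    without touching the network.
    """
    counts = Counter()
    for _ in range(attempts):
        addresses = resolve(hostname)
        if addresses:
            counts[addresses[0]] += 1
    return counts
```

Calling `first_address_counts("cache.domain.com")` on an affected host would be expected to put every count on a single address, while a resolver that rotates per response should spread the counts across both.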
DIG OUTPUT
; DiG 9.7.6-P1-RedHat-9.7.6-1.P1.18.amzn1 cache.domain.com
;; global options: +cmd
;; Got answer:
;; HEADER opcode: QUERY, status: NOERROR, id: 54610
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
cache.domain.com. IN A

;; ANSWER SECTION:
cache.domain.com. 100 IN CNAME xxxxx.us-east-1.elb.amazonaws.com.
xxxxx.us-east-1.elb.amazonaws.com. 3 IN A aaa.aaa.aaa.aaa
xxxxx.us-east-1.elb.amazonaws.com. 3 IN A bbb.bbb.bbb.bbb

;; Query time: 0 msec
;; SERVER: ccc.ccc.ccc.ccc#53(ccc.ccc.ccc.ccc)
;; WHEN: Mon Jul 2 15:09:27 2012
;; MSG SIZE rcvd: 130
Recursor config
allow-from=0.0.0.0/0
dont-query=
local-address=127.0.0.1
local-port=530 # Port should be changed to 530 because its not good to run on the same port as dns server
quiet=yes
setgid=pdns
setuid=pdns
disable-packetcache=
packetcache-ttl=0
forward-zones=domain.local=LOCALIP,domain.cloud=LOCALIP # Forward the two zones we care about back to the local dns server
forward-zones-recurse=amazonaws.com=172.16.0.23,compute-1.internal=172.16.0.23 # Forward queries for amazons domains to the resolver for amazon
SOLUTION
Add the following lines to recursor.conf:
disable-packetcache=
packetcache-ttl=0
Add the following line to pdns.conf:
recursive-cache-ttl=0
The PowerDNS Recursor caches at two levels.
It caches responses from authoritative servers for up to the TTL specified in the response it got (limited by max-cache-ttl but never exceeding the TTL it got from an auth).
Additionally, when a response packet from the recursor to a client (your clients that are generating load) is generated and sent, this packet is cached as a whole, so that the same question can be answered extremely quickly (without any parsing) if it comes in again. This is called the packetcache.
Shuffling happens in between these two levels. This means that your results are in fact shuffled, but their shuffle order is kept stable by the packetcache (for up to an hour, by default). If you want per-response shuffle, set 'disable-packetcache' or 'packetcache-ttl=0'.
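To make the ordering of those steps concrete, here is a toy Python model of the flow described above: records, then shuffle, then packet cache. It is only an illustrative sketch and not PowerDNS code; the class `ToyRecursor` and its parameters are hypothetical names, and the record cache is reduced to a plain list.

```python
import random
import time


class ToyRecursor:
    """Toy model of record cache -> shuffle -> packet cache (not PowerDNS code)."""

    def __init__(self, records, packetcache_ttl, clock=time.monotonic, rng=None):
        self.records = list(records)      # stands in for the record cache
        self.packetcache_ttl = packetcache_ttl
        self.clock = clock
        self.rng = rng or random.Random()
        self.packet_cache = {}            # question -> (expiry time, response)

    def answer(self, question):
        now = self.clock()
        cached = self.packet_cache.get(question)
        if cached is not None and now < cached[0]:
            # The whole response is replayed, so its order stays frozen.
            return list(cached[1])
        response = self.records[:]
        self.rng.shuffle(response)        # shuffle happens before packet caching
        if self.packetcache_ttl > 0:
            self.packet_cache[question] = (now + self.packetcache_ttl, response)
        return list(response)
```

With a positive `packetcache_ttl`, repeated calls to `answer` for the same question return one fixed order until the entry expires; with `packetcache_ttl=0` nothing is stored and every call is shuffled afresh, which mirrors the effect of the settings mentioned above.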
Not necessarily a "fix" - but do you need to use the CNAME from your application rather than directly querying the underlying A record? Presumably the CNAME => A record mapping doesn't change that often.
Sometimes the simplest dumb fixes are actually enough, and avoid having to solve all the world's problems just to get the results you need!