As a follow-up to this very popular question: Why is DNS failover not recommended?, I think it was agreed that DNS failover is not 100% reliable due to caching.
However, the highest-voted answer did not really discuss what the better solution is for achieving failover between two different data centers. The only solution presented was local load balancing (within a single data center).
So my question is quite simply: what is the real solution for cross-data-center failover?
This started off as a comment...but it's getting too long.
Sadly, most of the answers to the previous question are wrong: they assume that failover has something to do with the TTL. The top-voted answer is SPECTACULARLY wrong, and notably cites no sources. The TTL applies to the zone record as a whole and has nothing to do with Round Robin.
From RFC 1794 (which is all about Round Robin DNS serving)
(IME it's nearer to 3 hours before you get full propagation).
From RFC 1035
RFC 1034 sets out the requirements for negative caching - a method for indicating that all requests must be served fresh from the authoritative DNS server (in which case the TTL does control failover) - in my experience, support for this varies.
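To see the TTL a caching resolver is actually handing back (and how it counts down between queries), a quick sketch like the one below works. It assumes the third-party dnspython library, and example.com is just a placeholder name.

```python
# Rough sketch: inspect the TTL and addresses returned for an A record set.
# Assumes the third-party dnspython library (pip install dnspython);
# example.com is only a placeholder.
import dns.resolver

answer = dns.resolver.resolve("example.com", "A")
print("TTL on this answer:", answer.rrset.ttl)
for rr in answer:
    print("A record:", rr.address)

# Run this repeatedly against a caching resolver and you will see the TTL
# counting down; only once it hits zero does the resolver go back to the
# authoritative server - which is why record changes take time to be seen.
```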
Since any failover would have to be implemented high in the client stack, it's arguably not part of TCP/IP or DNS - indeed, SIP, SMTP, RADIUS and other protocols running on top of TCP/IP define how the client should work with Round Robin - RFC 2616 (HTTP/1.1) is notable for not specifying how a client should behave.
However, in my experience, every browser and most other HTTP clients written in the last 10 years will transparently check additional A RRs if the connection appears to be taking longer than expected. And it's not just me:
Failover times vary by implementation but are in the region of seconds. It's not an ideal solution, since (due to the limits of DNS) publishing the removal of a failed node takes up to the DNS TTL to propagate - in the meantime you have to rely on client-side detection.
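To make the client-side behaviour concrete, here is a rough sketch of the fallback logic: resolve every A record for the name and try each address in turn with a short connect timeout, skipping any node that doesn't answer. The hostname, port and timeout are placeholders, and real browsers use more elaborate logic than this.

```python
# Minimal sketch of client-side failover across Round Robin A records:
# resolve all addresses for the name, then try each with a short timeout.
# www.example.com, port 80 and the 3-second timeout are placeholder values.
import socket

def connect_with_failover(host, port, timeout=3.0):
    last_error = None
    # getaddrinfo returns every A (and AAAA) record the resolver knows about
    for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
            host, port, proto=socket.IPPROTO_TCP):
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock  # first address that answers wins
        except OSError as err:
            last_error = err  # dead or unreachable node - try the next RR
    raise last_error or OSError("no addresses returned for %s" % host)

if __name__ == "__main__":
    sock = connect_with_failover("www.example.com", 80)
    print("connected to", sock.getpeername())
    sock.close()
```

The failover time the user sees is roughly the connect timeout multiplied by the number of dead addresses tried before a live one answers, which is why it ends up in the region of seconds rather than the TTL.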
Round Robin is not a substitute for other HA mechanisms within a site, but it does complement them (the guys who wrote HAProxy recommend using a pair of installations accessed via Round Robin DNS). It is the best-supported mechanism for implementing HA across multiple sites: indeed, as far as I can determine, it is the only supported mechanism for failover available on standard clients.
A whole data center would need to go down or be unreachable for this to apply. Your backup at another data center would then be reached by routing the same IP addresses to the other data center: when the BGP route announcements from the primary data center are no longer provided, the announcements from the secondary data center are used instead.
Smaller businesses are generally not large enough to justify the expense of portable IP address allocations and their own autonomous system number to announce BGP routes with. In that case, a provider with multiple locations is the way to go.
You either have to be reachable via your original IP addresses, or via a change of IP address published through DNS. Since DNS is not designed to do this in the way "failover" requires (users can be cut off for at least as long as your TTL, or the TTL imposed by some caching resolvers), failing over to the backup site with the same IPs is the best solution.
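The announce/withdraw decision is normally driven by a health check rather than done by hand. Purely as an illustration, below is the kind of watchdog a software BGP speaker such as ExaBGP can run as a helper process: it keeps a service prefix announced while a local check passes and withdraws it when the check fails, so the secondary site's announcement takes over. The prefix, check URL and exact command strings are assumptions and depend on your daemon and setup.

```python
# Sketch of a health-check helper for a software BGP speaker (ExaBGP-style):
# announce the service prefix while the local service answers, withdraw it
# otherwise so traffic shifts to the other data center's announcement.
# 192.0.2.0/24, the URL and the command strings are placeholder assumptions.
import time
import urllib.request

PREFIX = "192.0.2.0/24"
CHECK_URL = "http://127.0.0.1:8080/health"

def service_healthy():
    try:
        with urllib.request.urlopen(CHECK_URL, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

announced = False
while True:
    healthy = service_healthy()
    if healthy and not announced:
        print(f"announce route {PREFIX} next-hop self", flush=True)
        announced = True
    elif not healthy and announced:
        print(f"withdraw route {PREFIX} next-hop self", flush=True)
        announced = False
    time.sleep(5)
```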
The simplest approach to dual-DC redundancy would be an L2 MPLS VPN between the two sites, along with maintaining BGP sessions between the two.
You can then essentially just have a physical IP per server and a virtual IP that floats between the two sites (HSRP/VRRP/CARP etc.). Your DNS would point at this virtual IP, and traffic would be directed accordingly.
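Whichever of HSRP/VRRP/CARP holds the virtual IP, you usually also want a hook that fires when the IP actually moves. As a sketch, keepalived (one common VRRP implementation) can call a notify script with the instance name and new state; something along these lines could log or act on the transition. The argument layout follows keepalived's convention as I understand it, while the syslog tag and per-state actions are placeholders.

```python
#!/usr/bin/env python3
# Sketch of a VRRP state-transition hook of the kind keepalived can call
# with arguments like: <INSTANCE|GROUP> <name> <MASTER|BACKUP|FAULT>.
# The syslog tag and the per-state actions below are placeholders.
import sys
import syslog

def main(argv):
    if len(argv) < 4:
        sys.exit("usage: notify.py TYPE NAME STATE")
    _, vrrp_type, name, state = argv[:4]
    syslog.openlog("vip-failover")
    syslog.syslog(f"{vrrp_type} {name} transitioned to {state}")
    if state == "MASTER":
        pass  # this node now holds the virtual IP: start/verify services here
    elif state in ("BACKUP", "FAULT"):
        pass  # the virtual IP has moved (or this node is unhealthy): stand down

if __name__ == "__main__":
    main(sys.argv)
```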
The next consideration would be split brain - but that's another question for another time.
Juniper wrote a good white paper on dual-DC management with MPLS; you can grab the PDF here: http://www.juniper.net/us/en/local/pdf/whitepapers/2000407-en.pdf