I want to build a highly available server cluster. I'd like to understand keepalived and Heartbeat in detail: what is the difference between the two, and how do I choose one?
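For context, keepalived does IP failover via VRRP, driven by a single config file. A minimal sketch of what that looks like (interface name and addresses here are just examples):

```
# keepalived.conf — minimal VRRP sketch
vrrp_instance VI_1 {
    state MASTER          # set to BACKUP on the standby node
    interface eth0
    virtual_router_id 51
    priority 100          # higher priority wins the election
    advert_int 1          # advertisement interval in seconds
    virtual_ipaddress {
        192.168.0.100/24  # the floating VIP
    }
}
```

Heartbeat, by contrast, is the messaging layer of a fuller cluster stack (usually paired with a resource manager like Pacemaker), so the comparison is really "simple VIP failover" versus "general cluster resource management".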
Based on the Wikipedia description of Anycast, it covers both distributing a domain-name-to-many-IP mapping across many DNS servers and replying to clients with the geographically closest (or fastest) server.
In the context of a globally distributed, highly available site like google.com (or any CDN with many global edge locations), these sound like exactly the two key features one would need.
DNS services like Amazon's Route53, EasyDNS and DNSMadeEasy all advertise themselves as Anycast-enabled networks.
Therefore my assumption is that each of these DNS services transparently offers me those two killer features: multi-IP-to-domain mapping AND routing clients to the closest node.
However, each of these services seems to separate out these two functionalities, referring to the second one (routing clients to the closest node) as "GeoDNS", "GeoIP" or "Global Traffic Director", and charging extra for it.
If a core tenet of an Anycast-capable system is to already do this, why is this functionality being earmarked as an extra feature? What is this "GeoDNS" feature doing that a standard Anycast DNS service won't do? (Going by Wikipedia's definition of Anycast, I understand what is being advertised, just not why it isn't implied already.)
I get extra confused when a DNS service like Route53, which doesn't support this nebulous "GeoDNS" feature, lists functionality like:
Fast – Using a global anycast network of DNS servers around the world, Route 53 is designed to automatically route your users to the optimal location depending on network conditions. As a result, the service offers low query latency for your end users, as well as low update latency for your DNS record management needs.
... which sounds exactly like what GeoDNS is intended to do, even though geographically directing clients is something they explicitly say they don't support yet.
Ultimately I am looking for the two following features from a DNS provider:
- Map multiple IP addresses to a single domain name (as google.com, amazon.com, etc. do).
- Use a DNS service that responds to client requests for that domain with the IP address of the server nearest the requester.
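For concreteness, the first feature on its own is just multiple A records for one name, i.e. round-robin DNS. A hypothetical BIND-style zone fragment (names and addresses are examples):

```
; three A records for a single name — plain round-robin
www.example.com.  300  IN  A  192.0.2.10
www.example.com.  300  IN  A  198.51.100.10
www.example.com.  300  IN  A  203.0.113.10
```

Plain round-robin hands these back in rotating order regardless of where the client is, which is presumably the gap the paid "GeoDNS" tier is meant to fill.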
As mentioned, it seems like this is all part of an "Anycast" DNS service (all of which these services are), but the features and marketing I see from them suggest otherwise, making me think I need to learn a bit more about how DNS works before making a deployment choice.
Thanks in advance for any clarifications.
Are there any major alternatives for automatic failover on Linux besides the typical Heartbeat/Pacemaker/CoroSync combinations? In particular, I'm setting up failover on EC2 instances, which only supports unicast - no multicast or broadcast. I'm specifically trying to handle the few pieces of software we have which don't already have automatic failover and don't support multi-master environments. This includes tools like HAProxy and Solr.
I have Heartbeat+Pacemaker working, but I'm not thrilled with it. Here are some of my issues:
- Heartbeat - By itself, limited to two nodes. I'd like to have 3+.
- Pacemaker - Impossible to configure automatically. The cluster has to be running with a quorum, and even then it still requires manual configuration.
- Corosync - Does not support unicast.
Pacemaker works very well, although its power makes it difficult to set up. The real problem with Pacemaker is that there is no easy way to automate the configuration. I really want to launch an EC2 instance, install Chef/Puppet, and have the entire cluster come up without my intervention.
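To illustrate, this is the kind of scripted, non-interactive configuration I'd want Chef/Puppet to run via the crm shell (resource name and address here are hypothetical):

```
# script the cluster configuration instead of editing it interactively
crm configure primitive p_vip ocf:heartbeat:IPaddr2 \
    params ip=10.0.0.50 cidr_netmask=24 \
    op monitor interval=10s
crm configure property no-quorum-policy=ignore   # small-cluster compromise
```

The catch, as noted above, is that these only succeed once the cluster is already up with quorum, so they can't simply be dropped into a first-boot provisioning run.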
We recently saw an issue after a failover of our router where our Windows 2008 boxes didn't start talking to the primary router again after fail-back.
When we did some digging, they still had the ARP entry for the secondary router. According to the TechNet Blog, this is by design:
First, a Windows Vista or Windows Server 2008 system will not update the Neighbor cache if an ARP broadcast is received unless it is part of a broadcast ARP request for the receiver. What this means is that when a gratuitous ARP is sent on a network with Windows Vista and Windows Server 2008, these systems will not update their cache with incorrect information if there is an IP address conflict.
Secondly, it appears that the Windows neighbor cache (ARP cache) is only updated if the machine can no longer talk to the machine currently in its cache. It does not send out occasional ARP requests to make sure the cache is not stale. While this isn't an issue during the initial failover, during fail-back, when both boxes are alive, it causes Windows to keep talking to the secondary box.
Is there any way to force Windows 2008 to accept Gratuitous ARP requests?
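For reference when testing, this is what the frame in question looks like on the wire: a broadcast ARP request whose sender and target IP are both the router's own address. A minimal Python sketch (the function name is mine, not from any library) that builds the 42-byte frame, e.g. for replaying via a raw socket while watching whether a host updates its cache:

```python
import struct

def gratuitous_arp(ip: str, mac: str) -> bytes:
    """Build a gratuitous ARP request frame: broadcast destination,
    sender IP == target IP, as a failover router would emit."""
    mac_b = bytes.fromhex(mac.replace(":", ""))
    ip_b = bytes(int(octet) for octet in ip.split("."))
    bcast = b"\xff" * 6
    # Ethernet II header: dst, src, EtherType 0x0806 (ARP)
    eth = bcast + mac_b + struct.pack("!H", 0x0806)
    # ARP header: htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, op=1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac_b + ip_b              # sender hardware / protocol address
    arp += b"\x00" * 6 + ip_b        # target MAC unknown, target IP = own IP
    return eth + arp

frame = gratuitous_arp("10.0.0.1", "02:00:00:aa:bb:cc")
```

The 14-byte Ethernet header plus the 28-byte ARP payload gives the standard 42-byte frame (before any padding the NIC adds).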
We have a small datacenter with about a hundred hosts pointing to 3 internal DNS servers (bind 9). Our problem comes when one of the internal DNS servers becomes unavailable. At that point all the clients that point to that server start performing very slowly.
The problem seems to be that the stock Linux resolver doesn't really have the concept of "failing over" to a different DNS server. You can adjust the timeout and number of retries it uses (and set rotate so it will work through the list), but no matter what settings we use, our services perform much more slowly if a primary DNS server becomes unavailable. At the moment this is one of our largest sources of service disruption.
My ideal answer would be something like "RTFM: tweak /etc/resolv.conf like this...", but if that's an option I haven't seen it.
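For reference, the closest I've gotten with resolv.conf alone is something like this (server addresses are examples), and it still pays roughly a one-second timeout on every lookup that happens to try the dead server first:

```
# /etc/resolv.conf — best-effort tuning of the stock resolver
options timeout:1 attempts:2 rotate
nameserver 10.0.0.53
nameserver 10.0.1.53
nameserver 10.0.2.53
```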
I was wondering how other folks handled this issue?
I can see 3 possible types of solutions:
- Use linux-ha/Pacemaker and failover IPs (so the DNS VIPs are "always" available). Alas, we don't have a good fencing infrastructure, and without fencing Pacemaker doesn't work very well (in my experience, Pacemaker without fencing lowers availability).
- Run a local DNS server on each node and have resolv.conf point to localhost. This would work, but it would give us a lot more services to monitor and manage.
- Run a local cache on each node. Folks seem to consider nscd "broken", but dnrd seems to have the right feature set: it marks DNS servers as up or down and won't use "down" servers.
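(dnsmasq would be another candidate for the local-cache option; a minimal forwarding-cache sketch, with upstream addresses as examples and each host's resolv.conf pointing at 127.0.0.1:)

```
# dnsmasq.conf — per-host forwarding cache
no-resolv            # don't take upstream servers from /etc/resolv.conf
server=10.0.0.53
server=10.0.1.53
server=10.0.2.53
all-servers          # query all upstreams, use the first reply
cache-size=1000
```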
Anycasting seems to work only at the IP routing level and depends on route updates to handle server failure. Multicasting seemed like it would be a perfect answer, but BIND does not support broadcast or multicast, and the docs I could find suggest that multicast DNS is aimed at service discovery and auto-configuration rather than regular DNS resolution.
Am I missing an obvious solution?