I am planning to deploy an infrastructure of 11 nodes using Opscode Chef to provide a high-availability web application. I would like to spread the nodes across datacentres for availability. My current thinking is round-robin DNS (rrDNS) resolving to one of two load balancers, each in a separate datacentre with its own cluster of nodes (serving the application with nginx, memcached and Sphinx). A third DC will host a MySQL master/slave pair, as I have read that replication does not perform well over a WAN. The aim is a design with no single point of failure.
My question is how these nodes should connect to one another. The traffic between these services is normally expected to travel over short LAN connections, so the services provide no built-in security, meaning I will need to secure the links themselves.
I was thinking of doing this with SOCKS tunnelling or a VPN. The latter would also improve the security of the nodes themselves, since they would no longer need to expose several services on their Internet-facing interfaces, only, say, an OpenVPN instance.
What are your thoughts on how to provide links between nodes in this sort of infrastructure?
If this is node-to-node communication and you can't establish private links (or a firewall-based VPN tunnel), I'd suggest n2n from ntop, a peer-to-peer VPN client.
The n2n clients (edges) need a supernode to register with and find each other through, but that can easily run on the Chef server. I use this approach for SNMP monitoring of systems where I don't have a VPN or control of the remote-side firewalls.
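To make the n2n layout a little more concrete, here is a rough Python sketch of how the overlay could be planned before handing it to Chef. Everything in it is an assumption for illustration: the node names, the 10.99.0.0/28 overlay range, the community name and the supernode address are made up, and the edge options used (-a, -c, -l) should be checked against the man page of the n2n version you install. The encryption key (edge's -k option) is deliberately left out and would need to be supplied securely.

```python
import ipaddress


def plan_edges(hostnames, overlay_cidr, community, supernode):
    """Give every node a static overlay address and an n2n edge argument list.

    All values passed in are hypothetical; nothing here talks to n2n itself.
    """
    network = ipaddress.ip_network(overlay_cidr)
    addresses = iter(network.hosts())  # usable addresses in the overlay range
    plan = {}
    for name in sorted(hostnames):  # sorted so the assignment is repeatable
        address = next(addresses, None)
        if address is None:
            raise ValueError("overlay range too small for %d nodes" % len(hostnames))
        # -a overlay address, -c community name, -l supernode host:port
        plan[name] = ["edge", "-a", str(address), "-c", community, "-l", supernode]
    return plan


if __name__ == "__main__":
    # Hypothetical node names and supernode address.
    nodes = ["web1", "web2", "db1"]
    for name, args in plan_edges(nodes, "10.99.0.0/28", "chefnet",
                                 "chef.example.com:7654").items():
        print(name, " ".join(args))
```

The resulting argument lists could then be rendered into a service definition by a Chef template, so that every node joins the same community through the supernode.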
In my opinion you already have your answer: a VPN is the way to do it. It will work with pretty much any data center or cloud provider.
If you are colocating or have your own infrastructure, you have another option that might be nice: an EVPL (Ethernet Virtual Private Line), which is basically a dedicated connection between sites. Most major ISPs can provide one. You can also look into private lines.
I have used EVPLs before and they work very well for this type of inter-site communication. For redundancy, it is a good idea to also set up a VPN as a secondary path for data in case the EVPL goes down.
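The primary/secondary idea can be sketched in a few lines of Python. This only illustrates the priority logic; in practice the failover would more likely be handled by route metrics or a routing protocol on the routers. The path names, addresses and port below are hypothetical.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_path(paths, probe=tcp_reachable):
    """Return the name of the first path, in priority order, whose far end answers.

    `paths` is a list of (name, host, port) tuples, most preferred first.
    Returns None if no path responds.
    """
    for name, host, port in paths:
        if probe(host, port):
            return name
    return None


if __name__ == "__main__":
    # Hypothetical addresses of the database as seen over each link.
    paths = [
        ("evpl", "10.0.0.10", 3306),  # dedicated line, preferred
        ("vpn", "10.8.0.10", 3306),   # VPN tunnel, used only as a fallback
    ]
    print(choose_path(paths))
```

Passing the probe as a parameter keeps the selection logic testable without any real network links.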