Due to a series of poor network design decisions (mostly) made many years ago in order to save a few bucks here and there, I have a network that is decidedly sub-optimally architected. I'm looking for suggestions to improve this less-than-pleasant situation.
We're a non-profit with a Linux-based IT department and a limited budget. (Note: none of the Windows equipment we run does anything that talks to the Internet, nor do we have any Windows admins on staff.)
Key points:
- We have a main office and about 12 remote sites that essentially double NAT their subnets with physically-segregated switches. (No VLANing and limited ability to do so with current switches)
- These locations have a "DMZ" subnet that is NAT'd onto an identically assigned 10.0.0.0/24 subnet at each site. These subnets cannot talk to the DMZs at any other location because we don't route them anywhere except between the server and the adjacent "firewall".
- Some of these locations have multiple ISP connections (T1, cable, and/or DSL) that we route manually using the ip routing tools in Linux. These firewalls all sit on the 10.0.0.0/24 network and are mostly "prosumer"-grade devices (Linksys, Netgear, etc.) or ISP-provided DSL modems.
- Connecting these firewalls (via simple unmanaged switches) are one or more servers that must be publicly accessible.
- Connected to the main office's 10.0.0.0/24 subnet are servers for email, telecommuter VPN, and remote-office VPN, plus the primary router to the internal 192.168.x.0/24 subnets. These have to be accessed from specific ISP connections based on traffic type and connection source.
- All our routing is done manually or with OpenVPN route statements
- Inter-office traffic goes through the OpenVPN service on the main 'Router' server, which has its own NAT'ing involved.
- Remote sites have only one server installed at each site and cannot afford multiple servers due to budget constraints. These servers are all LTSP servers, each serving 5-20 terminals.
- The 192.168.2/24 and 192.168.3/24 subnets are mostly, but NOT entirely, on Cisco 2960 switches that can do VLANs. The remainder are D-Link DGS-1248 switches that I'm not sure I trust enough to use for VLANs. There is also some lingering internal concern about VLANs, since only the senior networking staff person understands how they work.
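For context, the OpenVPN route statements mentioned above generally take a form like the following on the central VPN server. This is only a sketch; the tunnel subnet, site subnet, and client name are all made-up examples, not our actual values:

```
# /etc/openvpn/server.conf on the central 'Router' server (sketch only;
# all subnets and the client name are hypothetical examples)
server 10.200.0.0 255.255.255.0        # VPN tunnel subnet
client-config-dir /etc/openvpn/ccd

# Tell the server's kernel that site 1's LAN lives across the tunnel
route 192.168.101.0 255.255.255.0

# Tell other connected clients how to reach site 1 through the VPN
push "route 192.168.101.0 255.255.255.0"

# /etc/openvpn/ccd/site1 -- maps site 1's LAN to that client's tunnel
iroute 192.168.101.0 255.255.255.0
```

The `route`/`iroute` pair is what makes these setups grow hairy: every site subnet has to be declared twice, so a summarizable addressing plan keeps the statement count down.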
All regular internet traffic goes through the CentOS 5 router server, which in turn NATs the 192.168.x.0/24 subnets to the 10.0.0.0/24 subnet according to the manually configured routing rules we use to point outbound traffic at the proper internet connection, based on '-host' routing statements.
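For reference, this kind of per-connection steering is usually done with policy routing (one routing table per ISP plus source-based rules) rather than piles of '-host' routes. A sketch of what that looks like on a Linux router; the interface names, table names, and gateway addresses here are hypothetical examples:

```shell
# Sketch of per-ISP policy routing; eth1/eth2 and the gateway
# addresses are made-up examples.
#
# Assumes /etc/iproute2/rt_tables contains:
#   101 isp_t1
#   102 isp_cable

# One default route per ISP, each in its own table
ip route add default via 10.0.0.1 dev eth1 table isp_t1
ip route add default via 10.0.0.2 dev eth2 table isp_cable

# Steer traffic by source subnet instead of per-destination -host routes
ip rule add from 192.168.2.0/24 lookup isp_t1
ip rule add from 192.168.3.0/24 lookup isp_cable

# NAT the internal subnets out of whichever link they use
iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
iptables -t nat -A POSTROUTING -o eth2 -j MASQUERADE
```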
I want to simplify this and ready All Of The Things for ESXi virtualization, including these public-facing services. Is there a no- or low-cost solution that would get rid of the Double-NAT and restore a little sanity to this mess so that my future replacement doesn't hunt me down?
Basic Diagram for the main office:
These are my goals:
- Move the public-facing servers with interfaces on that middle 10.0.0.0/24 network into the 192.168.2/24 subnet on ESXi servers.
- Get rid of the double NAT and get our entire network on one single subnet. My understanding is that this is something we'll need to do under IPv6 anyway, but I think this mess is standing in the way.
1.) Before basically anything else, get your IP addressing plan straightened out. It's painful to renumber, but it's the necessary step to arrive at a workable infrastructure. Set aside comfortably large, easily summarized supernets for workstations, servers, remote sites (with unique IPs, naturally), management networks, loopbacks, etc. There's a lot of RFC1918 space and the price is right.
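Purely as an illustration (every number below is a made-up example, not a recommendation for your specific network), a plan carved out of 10.0.0.0/8 might give each site its own easily summarized /16:

```
10.1.0.0/16    main office    (10.1.1.0/24 servers, 10.1.2.0/24 workstations,
                               10.1.9.0/24 management, ...)
10.2.0.0/16    remote site 1  (same internal layout)
10.3.0.0/16    remote site 2  (and so on per site)
10.250.0.0/24  telecommuter VPN pool
```

With a layout like this, "everything at site 2" is a single route (10.2.0.0/16), which keeps both the static routes and the OpenVPN route statements short.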
2.) It's hard to get a sense of how to lay out L2 in your network based on the diagram above. VLANs may not be necessary if you've got sufficient numbers of interfaces in your various gateways as well as sufficient numbers of switches. Once you've got a sense of #1 it might make sense to reapproach the L2 question separately. That said, VLANs aren't an especially complex or novel set of technologies and needn't be that complicated. A certain amount of basic training is in order, but at a minimum the ability to separate a standard switch into several groups of ports (i.e. without trunking) can save a lot of money.
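If you do end up trunking VLANs to the Linux gateways, the 802.1Q side on Linux is only a few lines, so one trunked NIC can replace several physical interfaces. A sketch with hypothetical interface names, VLAN IDs, and addresses (on a CentOS 5 box the older vconfig tool does the same job as these ip link commands):

```shell
# Sketch: one trunked NIC carrying two VLANs on a Linux gateway.
# eth0, the VLAN IDs, and the addresses are hypothetical examples.
modprobe 8021q
ip link add link eth0 name eth0.20 type vlan id 20   # e.g. servers
ip link add link eth0 name eth0.30 type vlan id 30   # e.g. workstations
ip addr add 192.168.2.1/24 dev eth0.20
ip addr add 192.168.3.1/24 dev eth0.30
ip link set eth0.20 up
ip link set eth0.30 up
```

The switch port facing this NIC would be configured as a trunk carrying those two VLANs; the access ports stay untagged.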
3.) The DMZ hosts should probably be placed onto their own L2/L3 networks, not merged in with workstations. Ideally you'd have your border routers connected to a L3 device (another set of routers? L3 switch?) which, in turn, would connect a network containing your externally facing server interfaces (SMTP host, etc). These hosts would likely connect back to a distinct network or (less optimally) to a common server subnet. If you've laid out your subnets appropriately then the static routes required to direct inbound traffic should be very simple.
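Under a summarized addressing plan, those "very simple" static routes amount to one line per destination. The subnets and next-hop addresses below are hypothetical examples:

```shell
# Sketch, on a border router: send everything for the (example) DMZ
# network to the internal L3 device, and everything for a remote site
# to the VPN hub. Addresses are made-up examples.
ip route add 10.1.50.0/24 via 10.1.0.2   # DMZ subnet behind the L3 switch
ip route add 10.2.0.0/16  via 10.1.0.3   # an entire remote site, one route
```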
3a.) Try to keep the VPN networks separate from other inbound services. This will make things easier as far as security monitoring, troubleshooting, accounting, etc.
4.) Short of consolidating your Internet connections and/or routing a single subnet via several carriers (read: BGP), you'll need the intermediate hop before your border routers to be able to redirect in- and outbound traffic appropriately (as I suspect you're doing at the moment). This seems like a bigger headache than VLANs, but I suppose it's all relative.