I'm designing an architecture for a system (built on AWS) which will have multiple different production environments, in different zones.
Initially I had thought that it would be a good idea to use one VPC per environment, plus a separate Operations and Maintenance (OAM) VPC that provides a bastion host and is used for SSH connections to the environment VPCs (which can then block all inbound SSH traffic from the public internet). Each environment (env) VPC would have a VPC peering connection to the OAM VPC.
The problem with this approach is that if I want to access all env VPCs from the OAM VPC, they must each have different CIDR ranges. This has two implications:
- I need to keep track of which environment has which CIDR range
- I will be boxing myself in for future scalability: either the CIDR ranges are few and large, in which case I might hit a limit on the number of environments, or they are small and numerous, in which case I limit the number of things I can put in each environment.
The alternative approach is to keep all the env VPCs completely isolated from each other, which means that each one can have an identical CIDR range. To me this seems like an advantage, because identical environments mean easier maintenance and fewer human errors. Plus we can fit more things in each environment. But to access them I would have to add a bastion host in each one, which is a) less secure and b) wasteful.
Is there a best practice on how these two conflicting demands (security and conformity) should be reconciled?
Don't use overlapping CIDR blocks, as it limits your connectivity options: a VPC peering connection cannot be created between VPCs with overlapping ranges, and Transit Gateway cannot route between them. You'll make things difficult for yourself.
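A quick way to catch overlaps before they bite is to check the planned ranges programmatically. The following is a minimal sketch using only Python's standard `ipaddress` module; the VPC names and CIDR values are made up for illustration and are not taken from any real design.

```python
import ipaddress
from itertools import combinations


def find_overlaps(vpc_cidrs):
    """Return pairs of VPC names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in vpc_cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical plan: env-b accidentally reuses env-a's range.
vpcs = {
    "oam": "10.0.0.0/16",
    "env-a": "10.1.0.0/16",
    "env-b": "10.1.0.0/16",
}
conflicts = find_overlaps(vpcs)
print(conflicts)
```

Any pair reported here could not be peered with each other, so a check like this is worth running whenever a new range is added to the plan.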
Private address space is large enough that you are unlikely to run out: 10.0.0.0/8 alone contains 256 /16 blocks. Assign each account a /16 block and give each VPC a smaller block within it, such as a /20, or whatever it needs (a VPC's CIDR block cannot be larger than a /16). This won't work as well if you need to integrate with on-premises CIDR ranges. If you need public-facing resources, that still works fine. I keep CIDR information recorded in a Word document which contains my AWS design, but you could use a spreadsheet.
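To get a feel for how much room this gives you, the allocation can be sketched with the standard `ipaddress` module. This is only an illustration under assumptions: the 10.0.0.0/8 supernet, the account names, and the /20 per-VPC size are arbitrary choices, and the resulting dictionary could just as well be the spreadsheet mentioned above.

```python
import ipaddress

# Assumed supernet: the RFC 1918 10.0.0.0/8 range.
SUPERNET = ipaddress.ip_network("10.0.0.0/8")


def allocate(names, supernet=SUPERNET, account_prefix=16):
    """Hand out consecutive, non-overlapping blocks, one per name.

    Note: zip() stops silently if there are more names than blocks.
    """
    blocks = supernet.subnets(new_prefix=account_prefix)
    return dict(zip(names, blocks))


# Hypothetical account names.
plan = allocate(["oam", "env-a", "env-b"])
for name, block in plan.items():
    print(name, block, block.num_addresses)

# How many /16 blocks fit in the supernet.
capacity = 2 ** (16 - SUPERNET.prefixlen)

# Carving one account's /16 into /20 blocks for individual VPCs.
vpc_blocks = list(plan["env-a"].subnets(new_prefix=20))
print(capacity, len(vpc_blocks), vpc_blocks[0])
```

Because the blocks come from a single iterator over the supernet, they cannot overlap, which removes most of the bookkeeping burden.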
AWS VPCs can have additional CIDR ranges added if you run out of space.
You should look into using AWS Control Tower and a multi-account environment, rather than multi-VPC. Read up on the advantages, but manageability and blast radius are key. It takes more work to automate, but it will scale very well.
I typically have one AWS account per application per environment, so if I have three applications and four environments (dev, test, pre-prod, prod) I would have 12 accounts. I also use networking, security, logging, audit, and platform sandbox accounts. Secure it all with GuardDuty, AWS Config, and Security Hub; the whole setup is best created with CloudFormation and a pipeline of some sort.
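The account layout above can be enumerated with a few lines of Python, which is handy as input to whatever automation creates the accounts. The application names and the naming scheme here are placeholders I made up, not a convention that AWS or Control Tower prescribes.

```python
from itertools import product

# Hypothetical application names; the environments are the four listed above.
applications = ["app1", "app2", "app3"]
environments = ["dev", "test", "pre-prod", "prod"]

# Shared accounts mentioned above (the names are illustrative).
shared_accounts = ["networking", "security", "logging", "audit", "platform-sandbox"]

# One workload account per application per environment.
workload_accounts = [f"{app}-{env}" for app, env in product(applications, environments)]
all_accounts = workload_accounts + shared_accounts

print(len(workload_accounts), len(all_accounts))
```

Combined with one address block per account, a list like this gives every account a predictable name and a predictable range.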