I'm running my own NAT instance inside an AWS VPC. I want to make sure the NAT instance does not become a bottleneck, so I'd like to set my own expectation for when to scale out to a second NAT instance (if ever).
I understand that instance type (currently an m1.medium, if that matters) is an important part of this. What I'd like to know is how to check whether the NAT instance is approaching its maximum, and whether machines in the VPC could achieve better throughput if some of them used a different NAT instance.
It's NATting through a pretty simple iptables directive as shown below:
$ iptables --table nat --list
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  10.0.0.0/16          anywhere
There are two things it could bottleneck on: CPU and bandwidth. CPU should be simple to monitor, and it's easy to tell when it has hit the limit. Bandwidth should also be simple to monitor, but it's harder to tell when it has hit the limit, since EC2 doesn't guarantee a certain amount of network bandwidth. You could run some bandwidth tests and deduce a limit from those.
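As a rough sketch of the bandwidth side, you could sample the interface byte counters in /proc/net/dev on the NAT instance and turn the difference into a rate. The function names here are made up for illustration, and the interface name (eth0 below) is an assumption you'd need to check on your instance.

```python
def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev contents."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes; field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def throughput_mbps(before, after, seconds):
    """Return (rx, tx) in megabits per second between two counter samples."""
    rx = (after[0] - before[0]) * 8 / seconds / 1e6
    tx = (after[1] - before[1]) * 8 / seconds / 1e6
    return rx, tx


def read_counters(interface="eth0", path="/proc/net/dev"):
    """Read the current (rx_bytes, tx_bytes) pair for one interface."""
    with open(path) as f:
        return parse_net_dev(f.read())[interface]
```

To use it, call read_counters() twice a few seconds apart and pass both samples and the elapsed time to throughput_mbps(). If the rate during normal traffic sits close to what you measured in your bandwidth tests, the NAT instance is probably near its ceiling.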
In any case, network problems will show up on the clients in the form of TCP retransmissions. Monitoring this on all the clients will tell you when there's a problem, but not necessarily whether it's with your NAT instance or something else.
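One way to watch retransmissions on a Linux client is to read the Tcp counters from /proc/net/snmp and compare two samples. This is only a sketch: the function names are made up for illustration, and what ratio counts as "a problem" is something you'd have to establish from your own baseline.

```python
def parse_tcp_counters(text):
    """Return the Tcp counters from /proc/net/snmp contents as a dict."""
    rows = [line.split()[1:] for line in text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("expected a Tcp header line and a Tcp value line")
    header, values = rows[0], rows[1]
    return {name: int(value) for name, value in zip(header, values)}


def retransmit_ratio(before, after):
    """Fraction of segments sent between two samples that were retransmits."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0


def read_tcp_counters(path="/proc/net/snmp"):
    """Read the current Tcp counters from the running kernel."""
    with open(path) as f:
        return parse_tcp_counters(f.read())
```

Sampling read_tcp_counters() periodically on each client and graphing retransmit_ratio() gives you a trend to compare against the NAT instance's CPU and bandwidth graphs.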