TL;DR
CentOS6 NAT router/firewall behind a 120Mbps cable modem connection seems to be capping throughput at 30Mbps after recent updates and security "hardening".
Prior to updates and hardening I was getting 90Mbps.
I've checked CPU and network usage and neither of those seems to be a limiting factor. tc does not show any traffic shaping going on and I don't know how to troubleshoot this further.
Details
I have a CentOS 6 system running as a NAT router/firewall behind a Comcast cable modem, which is also running as a NAT router:
                      1000               100
                      eth1               eth0
Internet-------Modem-------------CentOS6-----------------LAN
                   10.0.0.0/24        192.168.10.0/24
The double NAT is a legacy from the CentOS system having previously served as a router/firewall behind a Time-Warner cable modem that ran in bridge mode. When I moved into Comcast territory I intended to switch the modem to bridge mode but never got around to it, and the double NAT never caused a problem. I was getting 90Mbps throughput with no issues.
In preparing to convert to bridged mode on the Comcast modem I decided to "harden" the CentOS system by disabling some unneeded services and doing "yum update", which I hadn't done in a while. After hardening I did a speed test and was surprised to find throughput down to 30Mbps.
I tried connecting my primary desktop system directly to the modem like this:
                  eth1               eth0
Internet---Modem-------------CentOS6-----------------LAN
             |   10.0.0.0/24      192.168.10.0/24
             |
             +--------------Desktop(Win7)
Running speedtest.net verified that my Comcast connection is capable of 120Mbps, so something I changed on the CentOS system has resulted in capping throughput at 30Mbps. Every time I do a speed test from the LAN (behind the CentOS system) I get a value within 1-2% of 30Mbps, so it almost feels like something is artificially capping throughput.
I thought maybe traffic shaping got enabled somehow, but tc seems to indicate it's not active:
[jhg@perseus ~]$ sudo tc -s qdisc
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 64159459406 bytes 44745482 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
Sent 2871293442 bytes 26151570 pkt (dropped 0, overlimits 0 requeues 0)
rate 0bit 0pps backlog 0b 0p requeues 0
The "hardening" consisted of
- removing some unneeded packages
- shutting down unneeded services
- setting up iptables to filter all incoming traffic except for one non-standard port for ssh
- installing and configuring tripwire
Removed packages:
redis dovecot
redhat-lsb-compat ipa-client
redhat-lsb nfs-utils-lib
redhat-lsb-printing nfs-utils
foomatic subversion
foomatic-db spamassassin
foomatic-db-ppds certmonger
cups yp-tools
mysql-server ypbind
mysql rpcbind
Currently enabled services:
abrt-ccpp cpuspeed kdump nmb
abrt-oops crond lvm2-monitor ntpd
abrtd dhcpd mcelogd postfix
acpid dkms_autoinstaller mdmonitor rsyslog
atd haldaemon messagebus smb
auditd ip6tables named sshd
autofs iptables netfs sysstat
blk-availability irqbalance network udev-post
My question is: What should I do next to figure out why my CentOS 6 router seems to be artificially capping throughput at 30Mbps?
So, the problem here turned out to be a hardware issue. Things were working fine a month ago, and one does not expect failed hardware to still "work" in a degraded mode, but that's what was happening.
The troubleshooting step that revealed the issue was to actually look at the ethernet port lights on the back of the cable modem. Instead of the green "1Gbps" light it was orange, signifying "100Mbps". In that mode, it appears the modem supports throughput only up to 30Mbps or so.
I know the modem (Arris TG-852G) has Gigabit Ethernet ports, so something was preventing the CentOS system from talking to the modem at 1Gbps. Using ethtool I saw output which essentially said (from the CentOS adapter's viewpoint) "I can support Gigabit Ethernet, and am advertising Gigabit Ethernet, but the peer doesn't support Gigabit Ethernet, so I'm connected at 100Mbps instead".
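For anyone wanting to run the same check, here is a minimal sketch. It assumes the usual ethtool output layout, which includes a line of the form "Speed: <n>Mb/s"; the helper name negotiated_speed is made up for illustration, and eth1 is the modem-facing interface from the question.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool <iface>` output on stdin and print the
# negotiated speed with the "Mb/s" suffix stripped.
negotiated_speed() {
    awk -F': ' '/^[ \t]*Speed:/ { sub(/Mb\/s$/, "", $2); print $2 }'
}

# Usage on the router (ethtool generally needs root):
#   ethtool eth1 | negotiated_speed
# To compare what each side offers, look at the advertised modes as well:
#   ethtool eth1 | grep -iE 'advertised link modes|link partner|speed|duplex'
```

If the local adapter advertises 1000baseT but the reported speed is 100, the link has negotiated down, and the cable, the port on the other end, or the adapter itself are the things to swap.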
I tried various fixes suggested in several online fora (including here), such as using a different cable, turning off auto-negotiation, advertising only 1Gbps, or setting the speed to 1Gbps manually. Turning off auto-negotiation and trying several different Cat6 cables had no effect, and the other two prevented a connection from being established at all.
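The exact commands used are not recorded above, so the following is only a sketch of how those three attempts are commonly expressed with ethtool -s. The function names and the RUN switch are hypothetical; 0x020 is the advertise mask the ethtool man page lists for 1000baseT Full. By default the helpers only print the command, so nothing is changed unless RUN=1 is set.

```shell
#!/bin/sh
# Print the command by default; execute it only when RUN=1 (needs root).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Attempt 1: turn auto-negotiation off on the given interface.
try_autoneg_off()  { run ethtool -s "$1" autoneg off; }

# Attempt 2: keep auto-negotiation but advertise only 1000baseT Full (0x020).
try_advertise_1g() { run ethtool -s "$1" advertise 0x020; }

# Attempt 3: set speed and duplex manually (assumed form of "setting the speed").
try_force_1g()     { run ethtool -s "$1" speed 1000 duplex full autoneg off; }

# Example (dry run): try_advertise_1g eth1
```

Changing link settings on the interface you are logged in through can drop the session, so these are best tried from the console.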
I concluded it had to be the adapter itself and ordered a new adapter. When it was installed it immediately connected at 1Gbps. Problem solved.
The moral of the story is, of course, that even though hardware failures in devices without moving parts are rare these days, they're still possible and should be eliminated before blaming the software.
What I would do here is revert the changes individually and run a speed test after each, or revert all the changes at once. Alternatively, benchmark an unmodified CentOS install as a baseline, then apply each change individually and run the speed test after each change.
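To make the before/after numbers comparable, something along these lines could be run on the router while a LAN client downloads a large file. It is a rough sketch that assumes the Linux sysfs byte counters under /sys/class/net; the function names are made up for illustration.

```shell
#!/bin/sh
# Convert a byte count transferred over a number of seconds into Mbit/s.
to_mbps() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", (b * 8) / (s * 1000000) }'
}

# Sample an interface's receive counter twice and print the average rate.
# $1 = interface name, $2 = sampling interval in seconds (default 10).
measure_rx_mbps() {
    iface="$1"
    secs="${2:-10}"
    before=$(cat "/sys/class/net/$iface/statistics/rx_bytes")
    sleep "$secs"
    after=$(cat "/sys/class/net/$iface/statistics/rx_bytes")
    to_mbps "$((after - before))" "$secs"
}

# Example: measure_rx_mbps eth1 10
```

Recording the result after each reverted change shows which step, if any, moves the number.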