We are occasionally, but not consistently, experiencing a weird network stack issue. Rebooting the server in question clears it up.
It happens as follows (gleaned through tcpdump on the server):
HTTP client starts sending request to Nginx.
Server responds normally, acking every packet it gets.
On the final client send, the packet never reaches the receiving socket on the server.
The client resends the packet several times, then the server finally times out and disconnects.
Also, strace of Nginx confirms that the data is not reaching Nginx.
Here is an edited version of the tcpdump output. I have simplified the exchange and anonymized some details.
Turning on iptables logging shows some packets being blocked, which may be relevant:
IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=39670 DPT=80 WINDOW=0 RES=0x00 RST URGP=0
IN= OUT=eth0 SRC=server DST=client LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=80 DPT=39669 WINDOW=31 RES=0x00 ACK URGP=0
However, our iptables setup is pedestrian. We block everything except RELATED,ESTABLISHED, and we allow the port in question, 80. I don't see why iptables is blocking this, unless the packets are somehow falling outside the states of RELATED and ESTABLISHED.
I have also included our sysctl settings in the above gist. Anything else I can look at?
Linux 3.8.0 on Ubuntu 12.04.3, on DigitalOcean.
Edit 3: Disabled iptables, same problem, so it's not caused by bad iptables rules.
Edit 2: Above I show iptables blocking RST packets, but more importantly it's blocking a lot of ACKs. I just picked a random log entry; ACK seems more common.
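To put numbers on which packets are being blocked, the log entries can be tallied by outgoing interface and TCP flags. The following is a minimal sketch, assuming the log lines have the same KEY=value layout as the two entries quoted above; the function names `parse_log_line` and `tally` are made up for illustration.

```python
from collections import Counter

# TCP flag names as they appear as bare words in iptables LOG output.
TCP_FLAGS = ("SYN", "ACK", "FIN", "RST", "PSH", "URG")


def parse_log_line(line):
    """Split one iptables LOG line into KEY=value fields and bare TCP flags."""
    tokens = line.split()
    # "IN=" yields an empty value; "URGP=0" is a field, not the URG flag.
    fields = dict(t.split("=", 1) for t in tokens if "=" in t)
    flags = tuple(t for t in tokens if t in TCP_FLAGS)
    return fields, flags


def tally(lines):
    """Count log entries by (outgoing interface, TCP flags)."""
    counts = Counter()
    for line in lines:
        fields, flags = parse_log_line(line)
        counts[(fields.get("OUT", ""), "+".join(flags))] += 1
    return counts


# The two entries quoted in the question.
sample = [
    "IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 "
    "DF PROTO=TCP SPT=39670 DPT=80 WINDOW=0 RES=0x00 RST URGP=0",
    "IN= OUT=eth0 SRC=server DST=client LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=0 "
    "DF PROTO=TCP SPT=80 DPT=39669 WINDOW=31 RES=0x00 ACK URGP=0",
]

for (iface, flags), n in tally(sample).items():
    print(iface, flags, n)
```

Feeding it the full kernel log instead of `sample` should show whether the blocked packets are mostly ACKs on eth0 or RSTs on lo.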
Edit 1: I added iptables tracing. This seems to be the part that drops a packet (though, again, not sure if this is related to my problem):
TRACE: raw:OUTPUT:rule:2 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
TRACE: raw:OUTPUT:policy:3 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
TRACE: filter:OUTPUT:rule:3 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
TRACE: filter:block:rule:1 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
TRACE: filter:logging:rule:1 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
iptables: reject: IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 WINDOW=0 RES=0x00 RST URGP=0
No idea why lo is involved here. Server is accepting traffic on eth0.
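The route the packet takes through the rule set can be read off the TRACE prefixes, which have the form table:chain:type:number. A small sketch that extracts that path is shown here; the helper name `trace_path` is hypothetical, and the sample lines are abbreviated copies of the trace above.

```python
import re

# Matches the "table:chain:type:number" token that follows "TRACE:".
TRACE_RE = re.compile(r"TRACE:\s+(\w+):([\w-]+):(rule|policy|return):(\d+)")


def trace_path(lines):
    """Return the (table, chain, type, number) hops found in TRACE lines."""
    hops = []
    for line in lines:
        match = TRACE_RE.search(line)
        if match:
            table, chain, kind, num = match.groups()
            hops.append((table, chain, kind, int(num)))
    return hops


# Abbreviated from the trace in the question; the last line is the LOG
# rule's own output and carries no TRACE prefix, so it is skipped.
sample = [
    "TRACE: raw:OUTPUT:rule:2 IN= OUT=lo SRC=client DST=server",
    "TRACE: raw:OUTPUT:policy:3 IN= OUT=lo SRC=client DST=server",
    "TRACE: filter:OUTPUT:rule:3 IN= OUT=lo SRC=client DST=server",
    "TRACE: filter:block:rule:1 IN= OUT=lo SRC=client DST=server",
    "TRACE: filter:logging:rule:1 IN= OUT=lo SRC=client DST=server",
    "iptables: reject: IN= OUT=lo SRC=client DST=server",
]

for table, chain, kind, num in trace_path(sample):
    print(f"{table}/{chain} {kind} {num}")
```

Read this way, the RST leaves through the OUTPUT chains on lo and ends up in the block and logging chains of the filter table, so it is an outgoing packet that is being rejected here.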
Your log definitely shows communication happening on the lo interface.
Try opening up iptables by changing the INPUT chain default policy to ACCEPT and deactivating any REJECT or DROP rule which might stand in the way.
I would bet $1000 that your filtering rules accepting traffic are bound to the eth0 interface, thus rejecting traffic incoming on lo.
I would pay attention to where the testing client is in relation to the server. If you are running your test client on the same machine, it most probably uses either the 127.0.0.1 IP address or the localhost domain name, which usually resolves to the same IP address. That will send traffic over the special loopback interface (lo) and not over eth0.
Unless you bound nginx to a specific interface by asking it to listen on its IP address, nginx will listen by default on 0.0.0.0, which is every interface. Thus, you won't notice whether it accepts connections on lo or not. You could try forcing nginx to listen on your eth0 IP address just to be sure.
When locally testing your server, ensure you use the IP address of one of your external interfaces (eth[0..]), or a domain name resolving to one of them.
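A quick way to check whether the host name or address a test client uses resolves to a loopback address is sketched below. It uses only the Python standard library; the function name `resolves_to_loopback` is made up for illustration, and it covers only the name-resolution case described above, not how the kernel routes the packets afterwards.

```python
import ipaddress
import socket


def resolves_to_loopback(host):
    """Return True if every address `host` resolves to is a loopback address."""
    infos = socket.getaddrinfo(host, None)
    # sockaddr is the last element of each entry; its first item is the IP.
    addrs = {info[4][0] for info in infos}
    # Strip an IPv6 zone suffix ("%eth0") before parsing, if present.
    return all(
        ipaddress.ip_address(addr.split("%")[0]).is_loopback for addr in addrs
    )


if __name__ == "__main__":
    # "localhost" usually, but not necessarily, maps to 127.0.0.1 or ::1.
    for target in ("localhost", "127.0.0.1"):
        print(target, resolves_to_loopback(target))
```

If the name used by the test client comes back as loopback, the requests are going to 127.0.0.1 or ::1 rather than to the address configured on eth0.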