We are occasionally, but not consistently, experiencing a weird network stack issue. Rebooting the server in question clears it up.
It happens as follows (gleaned through tcpdump on the server):
1. HTTP client starts sending a request to Nginx.
2. Server responds normally, acking every packet it gets.
3. On the final client send, the packet never reaches the receiving socket on the server.
4. The client resends the packet several times, then the server finally times out and disconnects.
Also, strace of Nginx confirms that the data is not reaching Nginx.
Here is an edited version of the tcpdump output. I have simplified the exchange and anonymized some details.
Turning on iptables logging shows some packets being blocked, which may be relevant:
IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=39670 DPT=80 WINDOW=0 RES=0x00 RST URGP=0
IN= OUT=eth0 SRC=server DST=client LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=80 DPT=39669 WINDOW=31 RES=0x00 ACK URGP=0
However, our iptables setup is pedestrian. We block everything except RELATED,ESTABLISHED, and we allow the port in question, 80. I don't see why iptables is blocking this, unless the packets are somehow falling outside the states of RELATED and ESTABLISHED.
I have also included our sysctl settings in the above gist. Anything else I can look at?
Linux 3.8.0 on Ubuntu 12.04.3, on DigitalOcean.
Edit 3: Disabled iptables, same problem, so it's not caused by bad iptables rules.
Edit 2: Above I show iptables blocking RST packets, but more importantly it's blocking a lot of ACKs. I just picked a random log entry; ACK seems to be more common.
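To check the impression that ACK is more common than RST among the blocked packets, the log can be tallied by TCP flag. This is a minimal Python sketch, assuming the log lines look like the two entries quoted above; flags_in and tally are hypothetical helper names, not part of any tool. Flags are matched as whole tokens so that fields such as URGP=0, or the ACK=0 acknowledgment number in TRACE lines, are not miscounted as flags.

```python
from collections import Counter

# TCP flag tokens as they appear in netfilter LOG output.
TCP_FLAGS = ("SYN", "ACK", "FIN", "RST", "PSH", "URG")


def flags_in(line):
    """Return the TCP flags present as standalone tokens in one log line."""
    tokens = line.split()
    return tuple(flag for flag in TCP_FLAGS if flag in tokens)


def tally(lines):
    """Count logged TCP packets by flag combination (e.g. 'ACK', 'SYN+ACK')."""
    counts = Counter()
    for line in lines:
        if "PROTO=TCP" not in line:
            continue  # skip anything that is not a logged TCP packet
        counts["+".join(flags_in(line)) or "none"] += 1
    return counts


# The two log entries quoted in the question.
SAMPLE = [
    "IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF "
    "PROTO=TCP SPT=39670 DPT=80 WINDOW=0 RES=0x00 RST URGP=0",
    "IN= OUT=eth0 SRC=server DST=client LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF "
    "PROTO=TCP SPT=80 DPT=39669 WINDOW=31 RES=0x00 ACK URGP=0",
]

for flags, count in tally(SAMPLE).most_common():
    print(flags, count)
```

Feeding it the full log instead of SAMPLE would show whether the blocked packets are mostly ACKs or mostly RSTs.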
Edit 1: I added iptables tracing. This seems to be the part that drops a packet (though, again, I'm not sure whether this is related to my problem):
TRACE: raw:OUTPUT:rule:2 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
TRACE: raw:OUTPUT:policy:3 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
TRACE: filter:OUTPUT:rule:3 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
TRACE: filter:block:rule:1 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
TRACE: filter:logging:rule:1 IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 SEQ=2118637628 ACK=0 WINDOW=0 RES=0x00 RST URGP=0
iptables: reject: IN= OUT=lo SRC=client DST=server LEN=40 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=TCP SPT=41572 DPT=8001 WINDOW=0 RES=0x00 RST URGP=0
No idea why lo is involved here. The server is accepting traffic on eth0.
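For reading longer traces, the path a packet takes through the tables and chains can be pulled out of the TRACE prefixes. This is a rough Python sketch, assuming the prefix format table:chain:type:rulenum seen in the trace above; trace_path is a hypothetical helper name, and the sample lines are abbreviated copies of the trace entries, keeping only a few fields.

```python
import re

# Matches the "TRACE: table:chain:type:rulenum" prefix of a trace entry.
TRACE_RE = re.compile(r"TRACE: ([^:\s]+):([^:\s]+):(rule|return|policy):(\d+)")


def trace_path(lines):
    """Return the (table, chain, type, rule number) steps found in the lines, in order."""
    path = []
    for line in lines:
        match = TRACE_RE.search(line)
        if match:
            table, chain, kind, num = match.groups()
            path.append((table, chain, kind, int(num)))
    return path


# Abbreviated from the trace in the question (most packet fields dropped).
SAMPLE_TRACE = [
    "TRACE: raw:OUTPUT:rule:2 IN= OUT=lo SRC=client DST=server SPT=41572 DPT=8001 RST",
    "TRACE: raw:OUTPUT:policy:3 IN= OUT=lo SRC=client DST=server SPT=41572 DPT=8001 RST",
    "TRACE: filter:OUTPUT:rule:3 IN= OUT=lo SRC=client DST=server SPT=41572 DPT=8001 RST",
    "TRACE: filter:block:rule:1 IN= OUT=lo SRC=client DST=server SPT=41572 DPT=8001 RST",
    "TRACE: filter:logging:rule:1 IN= OUT=lo SRC=client DST=server SPT=41572 DPT=8001 RST",
]

for table, chain, kind, num in trace_path(SAMPLE_TRACE):
    print(f"{table:<8}{chain:<10}{kind:<8}{num}")
```

For the trace above this lists raw OUTPUT, then filter OUTPUT, block and logging, which matches the reject being logged from the logging chain.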