I am running an nginx server that acts as a proxy to an upstream unix socket, like this:
upstream app_server {
    server unix:/tmp/app.sock fail_timeout=0;
}

server {
    listen ###.###.###.###;
    server_name whatever.server;

    root /web/root;
    try_files $uri @app;

    location @app {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_pass http://app_server;
    }
}
Some app server processes, in turn, pull requests off /tmp/app.sock
as they become available. The particular app server in use here is Unicorn, but I don't think that's relevant to this question.
The issue is that past a certain amount of load, nginx just can't get requests through the socket at a fast enough rate, no matter how many app server processes I set up.
I'm getting a flood of these messages in the nginx error log:
connect() to unix:/tmp/app.sock failed (11: Resource temporarily unavailable) while connecting to upstream
Many requests result in status code 502, and those that don't take a long time to complete. The nginx write queue stat hovers around 1000.
Anyway, I feel like I'm missing something obvious here, because this particular configuration of nginx and app server is pretty common, especially with Unicorn (it's the recommended method, in fact). Are there any Linux kernel options that need to be set, or something in nginx? Any ideas about how to increase the throughput to the upstream socket? Something that I'm clearly doing wrong?
Additional information on the environment:
$ uname -a
Linux servername 2.6.35-32-server #67-Ubuntu SMP Mon Mar 5 21:13:25 UTC 2012 x86_64 GNU/Linux
$ ruby -v
ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux]
$ unicorn -v
unicorn v4.3.1
$ nginx -V
nginx version: nginx/1.2.1
built by gcc 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
TLS SNI support enabled
Current kernel tweaks:
net.core.rmem_default = 65536
net.core.wmem_default = 65536
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_mem = 16777216 16777216 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.route.flush = 1
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_moderate_rcvbuf = 1
net.core.somaxconn = 8192
net.netfilter.nf_conntrack_max = 524288
Ulimit settings for the nginx user:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) 16382
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
It sounds like the bottleneck is the app powering the socket rather than Nginx itself. We see this a lot with PHP when used over a socket versus a TCP/IP connection; in our case, though, PHP bottlenecks much earlier than Nginx ever would.
Have you checked the connection tracking limit and the socket backlog limits in sysctl.conf, in particular:
net.core.somaxconn
net.core.netdev_max_backlog
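A quick way to inspect and raise those limits from a shell, as a sketch (the values here are illustrative, not recommendations):

$ sysctl net.core.somaxconn net.core.netdev_max_backlog net.netfilter.nf_conntrack_max
$ sudo sysctl -w net.core.somaxconn=8192          # upper bound on any listen() backlog
$ sudo sysctl -w net.core.netdev_max_backlog=4096
$ sudo sysctl -p                                   # reload after persisting the keys in /etc/sysctl.conf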
tl;dr
listen("/var/www/unicorn.sock", backlog: 1024)
worker_connections 10000;
Discussion
We had the same problem: a Rails app served by Unicorn behind an NGINX reverse proxy.
We were getting the same "connect() ... failed (11: Resource temporarily unavailable)" lines in the NGINX error log.
Reading the other answers, we also figured that maybe Unicorn was to blame, so we increased its backlog, but that did not resolve the problem. Monitoring the server processes, it was obvious that Unicorn was not getting any requests to work on, so NGINX appeared to be the bottleneck.
Searching for NGINX settings to tweak in nginx.conf, this performance tuning article pointed out several settings that could impact how many parallel requests NGINX can process, especially worker_connections.
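For reference, a minimal sketch of where that directive lives in nginx.conf, using the value from the tl;dr above (the number is illustrative; the stock default is only 512):

events {
    # Each proxied request uses two connections (client + upstream),
    # so this cap is easy to hit under load.
    worker_connections 10000;
}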
You might try looking at unix_dgram_qlen; see the proc docs. Although this may compound the problem by letting more pile up in the queue? You'll have to look (netstat -x...).
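If you want to poke at that, a quick check might look like this (note that max_dgram_qlen only applies to SOCK_DGRAM sockets, while the nginx-to-Unicorn connection is a stream socket, so the listen backlog is usually the more relevant knob):

$ cat /proc/sys/net/unix/max_dgram_qlen
$ netstat -x | grep app.sock    # active unix domain sockets for the app
$ ss -xl | grep app.sock        # listening unix sockets with their Recv-Q/Send-Q counters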
I solved it by increasing the backlog number in config/unicorn.rb. I used to have a backlog of 64 and was getting the error above; after increasing it to 1024, I no longer get the error.
The backlog default value is 1024 in the Unicorn config:
http://unicorn.bogomips.org/Unicorn/Configurator.html
1024 clients is the unix domain socket limit.
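For reference, a minimal sketch of that change in config/unicorn.rb, using the socket path from the question's nginx config (the worker count is illustrative):

# config/unicorn.rb
worker_processes 8                        # illustrative; size to your hardware
# Bind the unix socket nginx proxies to, with a larger listen backlog
listen "/tmp/app.sock", :backlog => 1024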