
Question

Alexander Gladysh

Asked: 2011-04-08 03:50:30 +0800 CST

How to debug troubles with unix domain sockets?


Ubuntu Server 10.04.2

$ uname -a
Linux my.local 2.6.32-30-generic-pae #59-Ubuntu SMP 
Tue Mar 1 23:01:33 UTC 2011 i686 GNU/Linux

It seems that my domain socket queue is overflowing, but I can't prove it.

I've got this stack nginx->[spawn-fcgi->multiwatch->]custom-fcgi-service

Nginx communicates with custom-fcgi-service by means of a unix domain socket.

Today we got a slight increase in traffic, and suddenly my nginx error.log is full of eels:

2011/04/07 15:31:51 [error] 28187#0: *469350 connect() to unix:/tmp/my.socket 
failed (11: Resource temporarily unavailable) while connecting to upstream, 
client: [IP withheld], server: my.local, request: "GET /myurl HTTP/1.0", 
upstream: "fastcgi://unix:/tmp/my.socket:", host: "example.com"

Some requests make it through, but many return a 5xx error.

If I restart custom-fcgi-service, the error goes away, but soon enough it reappears. After inspecting the status of custom-fcgi-service, I'm reasonably sure that it works OK (though it may be too slow for this amount of traffic, but that is a mere hypothesis).

I've tried doing this:

echo 65535 > /proc/sys/net/unix/max_dgram_qlen

But it did not help much. (The time-to-error may have become longer, I'm not sure, but not enough to fix the problem.)

If I increase the number of worker forks of custom-fcgi-service, the error stays away for longer, but so far I have not been able to raise the worker count high enough to fix it for good. CPU, memory, and IO load on that machine are well within limits, so, again, I think that custom-fcgi-service is just being slow on some subsequent network call.

The question is: how do I debug this issue? And if it is indeed the socket queue length, how can I build a sensor that warns us when we need to fork more custom-fcgi-service workers?
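One possible shape for such a sensor, sketched under assumptions rather than taken from this thread: on kernels where `ss` can query unix sockets through sock_diag (newer than the 2.6.32 kernel above), `ss -x -l -n` is understood to report the number of pending connections in Recv-Q and the backlog limit in Send-Q for listening sockets. The function names, the 0.8 threshold, and the `SAMPLE` text are all illustrative; `SAMPLE` is made-up input, not captured output.

```python
import subprocess


def parse_listen_queues(ss_output):
    """Map socket path -> (pending, backlog) from `ss -x -l -n` text.

    Assumes whitespace-separated columns in the order
    Netid, State, Recv-Q, Send-Q, Local Address, ...
    """
    queues = {}
    for line in ss_output.splitlines():
        fields = line.split()
        # Keep only listening unix stream sockets; this also skips the header.
        if len(fields) < 5 or fields[0] != "u_str" or fields[1] != "LISTEN":
            continue
        try:
            pending, backlog = int(fields[2]), int(fields[3])
        except ValueError:
            continue
        queues[fields[4]] = (pending, backlog)
    return queues


def queue_is_nearly_full(ss_output, path, threshold=0.8):
    """True when pending connections reach `threshold` of the backlog."""
    pending, backlog = parse_listen_queues(ss_output).get(path, (0, 0))
    return backlog > 0 and pending >= threshold * backlog


def read_ss():
    """Run ss for listening unix sockets (requires iproute2's ss)."""
    result = subprocess.run(
        ["ss", "-x", "-l", "-n"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical input for illustration only, not real output from any host.
SAMPLE = """\
Netid State  Recv-Q Send-Q Local Address:Port Peer Address:Port
u_str LISTEN 120    128    /tmp/my.socket 12345 * 0
u_str LISTEN 0      128    /tmp/other.socket 12346 * 0
"""
```

A monitoring job could call `queue_is_nearly_full(read_ss(), "/tmp/my.socket")` every few seconds and alert when it returns true. If the Recv-Q and Send-Q columns always read 0 on a given system, that kernel or `ss` build probably does not expose the queue lengths and this approach will not work there.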

2 Answers


ghisguth · Answer 1 · 2011-04-08T04:30:34+08:00


It seems you have a problem with connect, not with send. Try increasing the kernel's receive backlog:

echo "2000" > /proc/sys/net/core/netdev_max_backlog

or

sysctl -w net.core.netdev_max_backlog=2000

Have you checked system logs (e.g. dmesg)?
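To check that the failure really is on the connect side, the situation can be reproduced in isolation. The sketch below is an illustration, not part of the original setup: it listens on a throwaway unix socket, never calls accept(), and keeps making non-blocking connects until one fails. On Linux the failure is expected to carry errno 11 (EAGAIN), the same code as in the nginx log, once the listen backlog is full. The helper name `fill_backlog` and its parameters are hypothetical.

```python
import errno
import os
import socket
import tempfile


def fill_backlog(path, backlog=1, max_clients=64):
    """Listen on `path`, never accept, and connect until connect() fails.

    Returns (successful_connects, errno_of_failure_or_None).
    """
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(backlog)
    clients = []
    failure = None
    try:
        for _ in range(max_clients):
            client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            # Non-blocking, so a full queue raises instead of hanging.
            client.setblocking(False)
            try:
                client.connect(path)
            except OSError as exc:
                failure = exc.errno
                client.close()
                break
            clients.append(client)
    finally:
        for client in clients:
            client.close()
        server.close()
        os.unlink(path)
    return len(clients), failure


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.sock")
    connected, err = fill_backlog(demo_path)
    name = errno.errorcode.get(err, "none") if err is not None else "none"
    print(f"{connected} connects queued before failure: {name}")
```

If this reproduces the error, the likely reading is that the workers are not calling accept() fast enough, so a larger backlog only buys time while more or faster workers address the cause.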

3

Rocky · Answer 2 · 2016-09-11T23:32:52+08:00


Try changing the backlog setting in spawn's configuration file, e.g. backlog: 4096.
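One caveat worth checking: per listen(2), Linux silently caps the backlog passed to listen() at the value in /proc/sys/net/core/somaxconn, which defaulted to 128 on kernels before 5.4. A requested backlog of 4096 may therefore have no effect until that sysctl is raised as well. A minimal sketch of the check follows; the function names are illustrative.

```python
from pathlib import Path

SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(path=SOMAXCONN_PATH):
    """Return the kernel's cap on listen() backlogs, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested, somaxconn):
    """Backlog the kernel is expected to apply for listen(requested)."""
    if somaxconn is None:
        # Cap unknown (e.g. not Linux); assume the request is honored.
        return requested
    return min(requested, somaxconn)
```

For example, `effective_backlog(4096, read_somaxconn())` shows what a configured backlog of 4096 would actually amount to on the current machine.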

-3
