I suspect that one of our server applications has hit its max open file limit.
The application is running in user-space with its own account. The init-script starts a large number of processes which in turn start a number of sub-processes and a large number of threads.
Going by the book, I set the following in /etc/security/limits.conf:
USERNAME - nofile 2048
I suspect that the application has hit the limit - by looking at the temporary file directory I found more than 2000 files there.
After raising the limit to 4096 and doing an application restart I found more than 2100 files there.
Now the question: If the application hit the limit 2048 - why was this not logged in /var/log/messages?
syslog-ng is the current syslog-daemon in use.
/etc/syslog-ng/syslog-ng.conf
options { long_hostnames(off); sync(0); perm(0640); stats(3600); };
source src {
internal();
unix-dgram("/dev/log");
unix-dgram("/var/lib/ntp/dev/log");
};
filter f_iptables { facility(kern) and match("IN=") and match("OUT="); };
filter f_console { level(warn) and facility(kern) and not filter(f_iptables)
or level(err) and not facility(authpriv); };
filter f_newsnotice { level(notice) and facility(news); };
filter f_newscrit { level(crit) and facility(news); };
filter f_newserr { level(err) and facility(news); };
filter f_news { facility(news); };
filter f_mailinfo { level(info) and facility(mail); };
filter f_mailwarn { level(warn) and facility(mail); };
filter f_mailerr { level(err, crit) and facility(mail); };
filter f_mail { facility(mail); };
filter f_cron { facility(cron); };
filter f_local { facility(local0, local1, local2, local3,
local4, local5, local6, local7); };
filter f_messages { not facility(news, mail, cron, authpriv, auth) and not filter(f_iptables); };
filter f_warn { level(warn, err, crit) and not filter(f_iptables); };
filter f_alert { level(alert); };
filter f_auth { facility(authpriv, auth); };
destination console { pipe("/dev/tty10" group(tty) perm(0620)); };
log { source(src); filter(f_console); destination(console); };
destination xconsole { pipe("/dev/xconsole" group(tty) perm(0400)); };
log { source(src); filter(f_console); destination(xconsole); };
destination auth { file("/var/log/auth"); };
log { source(src); filter(f_auth); destination(auth); };
destination newscrit { file("/var/log/news/news.crit"); };
log { source(src); filter(f_newscrit); destination(newscrit); };
destination newserr { file("/var/log/news/news.err"); };
log { source(src); filter(f_newserr); destination(newserr); };
destination newsnotice { file("/var/log/news/news.notice"); };
log { source(src); filter(f_newsnotice); destination(newserr); };
destination mailinfo { file("/var/log/mail.info"); };
log { source(src); filter(f_mailinfo); destination(mailinfo); };
destination mailwarn { file("/var/log/mail.warn"); };
log { source(src); filter(f_mailwarn); destination(mailwarn); };
destination mailerr { file("/var/log/mail.err" fsync(yes)); };
log { source(src); filter(f_mailerr); destination(mailerr); };
destination mail { file("/var/log/mail"); };
log { source(src); filter(f_mail); destination(mail); };
destination cron { file("/var/log/cron"); };
log { source(src); filter(f_cron); destination(cron); };
destination localmessages { file("/var/log/localmessages"); };
log { source(src); filter(f_local); destination(localmessages); };
destination messages { file("/var/log/messages"); };
log { source(src); filter(f_messages); destination(messages); };
destination firewall { file("/var/log/firewall"); };
log { source(src); filter(f_iptables); destination(firewall); };
destination warn { file("/var/log/warn" fsync(yes)); };
log { source(src); filter(f_warn); destination(warn); };
You need to actually know if you are running out of files.
Run your process, then check what its limits actually are:
cat /proc/<pid>/limits
Then get a count of its open file descriptors by running:
ls -1 /proc/<pid>/fd | wc -l
Note that each process has its own limits (the children of a parent, for example). Threads, however, share the file descriptor table of the process that created them, so the open-file limit is shared between that process and all of its threads.
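The two checks above can also be scripted. What follows is a minimal Python sketch, assuming a Linux /proc filesystem; the function name fd_usage is made up for illustration and is not part of any tool mentioned here. Listing /proc/<pid>/fd briefly opens a descriptor of its own, so when inspecting the current process the count is one higher than the steady state, just as with ls.

```python
import os


def fd_usage(pid):
    """Return (open_fds, soft_limit, hard_limit) for a process, read from /proc.

    The limits are returned as strings exactly as the kernel prints them.
    """
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft = hard = None
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Line layout: "Max open files  <soft>  <hard>  files"
                fields = line.split()
                soft, hard = fields[3], fields[4]
                break
    return open_fds, soft, hard


if __name__ == "__main__":
    # Inspect this process; pass the PID of the server process instead.
    print(fd_usage(os.getpid()))
```

Reading another user's process this way needs the same permissions as the ls command above, so run it as root or as the application's own account.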
Whilst you cannot create threads in bash, this program can be used to demonstrate the effect.
This program produces 3 children with 3 threads.
Each child and thread continually creates a new file descriptor every half second plus some random wait.
Looking at the child processes, you can see that each child has an independent file descriptor table.
However, the threads of each child all share that child's descriptor count.
Also note that child pids can have independent limits. This program also sets a random limit on invocation of each child.
And for added redundant fun, it also sets a random open files limit from each thread. But this doesn't stick as a per-thread value: the limit is shared between all threads in the process and the child process itself.
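The behaviour described above can be reproduced with a short Python sketch. This is not the program referred to above, only an illustration under some assumptions: it uses fixed limits instead of random ones, has no waits, and the names open_until_emfile and run_child are made up for this example. Each child lowers its own soft limit (the chosen values are assumed to be at or below the hard limit), then several threads open /dev/null until the kernel refuses with EMFILE. The refusal reaches the application only as an error code, so whether it ends up in any log depends on the application reporting it.

```python
import errno
import os
import resource
import threading


def open_until_emfile(opened, lock):
    """Thread body: open /dev/null until the process-wide limit is hit."""
    while True:
        try:
            fd = os.open("/dev/null", os.O_RDONLY)
        except OSError as exc:
            if exc.errno == errno.EMFILE:
                return  # table is full; the only notification is this errno
            raise
        with lock:
            opened.append(fd)


def run_child(soft_limit, n_threads=3):
    """Fork a child that lowers its own soft limit and fills its descriptor
    table from several threads. Returns (fds_opened, child_soft_limit)."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        status = 1
        try:
            os.close(read_end)
            _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
            # Assumes soft_limit <= hard; affects only this child.
            resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
            opened, lock = [], threading.Lock()
            threads = [
                threading.Thread(target=open_until_emfile, args=(opened, lock))
                for _ in range(n_threads)
            ]
            for t in threads:
                t.start()
            for t in threads:
                t.join()
            soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
            os.write(write_end, f"{len(opened)} {soft}".encode())
            status = 0
        finally:
            os._exit(status)
    # parent process
    os.close(write_end)
    report = os.read(read_end, 64).decode()
    os.close(read_end)
    os.waitpid(pid, 0)
    n_opened, soft = report.split()
    return int(n_opened), int(soft)


if __name__ == "__main__":
    parent_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    for limit in (64, 96, 128):
        opened, soft = run_child(limit)
        print(f"child limit {soft}: threads opened {opened} descriptors in total")
    # The parent's own limit is untouched by what the children did.
    print("parent limit unchanged:",
          parent_limit == resource.getrlimit(resource.RLIMIT_NOFILE))
```

The total opened by all threads together stays below the child's limit, because the threads draw from one shared table that already holds the inherited descriptors, while each child gets its own table and its own limit.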
You lack this source definition:
Then you're fine!