I have a Java process (Glassfish) that is leaking file descriptors. I know this because I get the helpful "java.io.IOException: Too many open files" exception. I can look in /proc/PID#/fd and see all the open file descriptors. When I use lsof I get a very large number of entries like this:
java 18510 root 8811u sock 0,4 1576079 can't identify protocol
java 18510 root 8812u sock 0,4 1576111 can't identify protocol
java 18510 root 8813u sock 0,4 1576150 can't identify protocol
I see 12 new ones created per minute. What options can I use with lsof, or what other tools are available, to help me track down socket file descriptors where the protocol can't be identified?
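For reference, one rough way to count these entries at a point in time (a sketch only; it assumes lsof is on the PATH and that <PID> is replaced with the Glassfish process ID):

lsof -p <PID> | grep -c "can't identify protocol"

Repeating that every minute or so gives a feel for the rate at which they accumulate.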
To see the top 20 processes by open file handle count (each line of output is: file handle count, PID, command line of the process), a shell one-liner like the sketch below works.
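A minimal sketch, assuming a Linux /proc filesystem and that the user running it can read each process's fd directory (processes it cannot read will report a count of 0):

# count the entries in every process's fd directory, then sort descending and keep the top 20
for pid in /proc/[0-9]*; do
    echo "$(ls "$pid/fd" 2>/dev/null | wc -l) ${pid##*/} $(cat "$pid/cmdline" 2>/dev/null | tr '\0' ' ')"
done | sort -rn | head -20

Run as root it sees every process; as an ordinary user it only counts the processes you own.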
Become familiar with the strace command. It monitors system calls. I recently used it to track down file descriptor leaks that were causing our snmpd daemon to crash repeatedly. It takes some getting used to, but it's a powerful tool.
You can use strace to attach to a running process (don't forget the -f flag to follow child processes).
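A minimal sketch of such an invocation for this case (assuming the leaking Glassfish PID is 18510, as in the lsof output above, and that strace is run with sufficient privileges to attach):

# attach to the running JVM, follow its threads/children, and log socket lifecycle calls to a file
strace -f -p 18510 -e trace=socket,connect,accept,close -o /tmp/glassfish-fds.trace

Each socket() or accept() call in the trace shows the file descriptor number it returned, so descriptors that never get a matching close() can be compared against the numbers lsof reports.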
What exactly are you trying to track down? The remote IP address(es) associated with the leaked FDs, the defective code, or something else?
As you've already identified that there is a leak, contacting the engineers responsible for this Java process seems like a reasonable next step.