A couple of weeks ago I changed the DNS server addresses on a large network of around 300 nodes. After that, some of the nodes kept querying the old DNS servers, although resolv.conf was correct and host/nslookup were querying the new DNS servers.
Looking at tcpdump and logging requests with iptables, I confirmed that some of the hosts were indeed still sending queries to the old nameservers.
I took one of the hosts out of production and started shutting down services and stracing processes in an attempt to find the culprit.
In the end, it was the lldpd daemon, which apparently cached the nameservers at startup and never noticed the changes in resolv.conf.
So my question is: is there a more intelligent way to find out which PID is generating a specific kind of traffic? I tried auditctl, but without much success. The hosts run CentOS 6, but I would appreciate a solution for any Linux distro.
What's wrong with auditctl?
You would do it like this:
1) Define your audit rule to audit sendmsg and sendto system calls. These system calls are used during name resolution.
2) Now search your audit records. You can grep for the remote DNS server's IP here.
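A minimal sketch of such a rule, assuming a 64-bit system (arch=b64). The helper name audit_rule_cmd and the key dns_trace are made up for illustration. The helper only prints the auditctl command line so it can be reviewed before being run as root.

```sh
# Print (do not run) an auditctl command that audits the given system calls.
# Usage: audit_rule_cmd KEY SYSCALL...
# KEY is an arbitrary label used later to find the matching records.
audit_rule_cmd() {
    key=$1
    shift
    cmd="auditctl -a exit,always -F arch=b64"
    for sc in "$@"; do
        cmd="$cmd -S $sc"
    done
    printf '%s -k %s\n' "$cmd" "$key"
}

# Show the rule for the two system calls named in step 1.
audit_rule_cmd dns_trace sendmsg sendto
```

After running the printed command as root, something like ausearch -k dns_trace -i should list the matching records, which can then be filtered with grep for the IP of the old DNS server.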
In my example, the audit record showed that the application responsible for the system call was dig.
The record also shows which remote DNS server the request was sent to, so you just have to grep for a particular DNS host.
Or, even better, see which DNS hosts are in use (I have only one in this example).
Then narrow down which applications are using those particular hosts.
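In raw audit records (as opposed to the interpreted ausearch -i output), the saddr field holds the socket address as a hex dump. As a rough sketch, assuming an IPv4 address logged on a little-endian x86 machine, it can be decoded as below. The function name decode_saddr is only illustrative.

```sh
# Decode a raw audit "saddr" hex string for an IPv4 socket into ip:port.
# Layout assumed: 2 bytes family (02 00 = AF_INET on little-endian x86),
# 2 bytes port in network byte order, 4 bytes address, then padding.
decode_saddr() {
    hex=$1
    case $hex in
        0200*) ;;            # AF_INET, continue
        *) return 1 ;;       # anything else is not handled by this sketch
    esac
    port=$(printf '%d' "0x$(printf '%s' "$hex" | cut -c5-8)")
    ip=
    for pos in 9 11 13 15; do
        octet=$(printf '%d' "0x$(printf '%s' "$hex" | cut -c"$pos"-"$((pos + 1))")")
        ip="${ip:+$ip.}$octet"
    done
    printf '%s:%s\n' "$ip" "$port"
}
```

To list the distinct raw addresses to feed into it, something like grep -o 'saddr=[0-9A-F]*' /var/log/audit/audit.log | sort -u should work, assuming the default audit log location.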
Edit 1: I just ran strace on a simple ping to a host, and it seems that sendmsg is not always used. Here is what I see:
My previous example was based on dig, which takes a slightly different route in terms of system calls.
So it looks like in the majority of cases it would be this rule:
Followed by ausearch.
I wrestled with the very same problem a few days ago and came up with a very simple method. It relies on the fact that the sending process will be waiting for the DNS response on the same port it sent the request from:
1) Find the source port of the outgoing DNS request with iptables -j LOG.
2) Use lsof -i UDP:<source_port> to find out which process is waiting for a response on that port.
Of course, as the response arrives within milliseconds, you can't do that manually; moreover, even when automated, there's no guarantee that you will be able to query the system before the DNS response arrives and the sending process exits. That is why, before executing the steps above, I also configure the kernel traffic control to delay outgoing packets directed to a specific IP/port (using the tc netem module). This lets me control the time window I have to ask the system which PID is waiting for the DNS response on the source UDP port obtained in step 1.
I have automated the above steps, including the tc delay, in a small script called ptrap (a more general solution, not limited to DNS requests, and thus suitable for detecting processes that use any TCP/UDP-based protocol). With its aid I found out that, in my case, the service contacting the old DNS server was sendmail.
There is atop. There is a kernel module (netatop) and a daemon which make atop track network usage by process.
You should first install atop:
Here is how you install the kernel module. This was valid when the post was written, but it may become outdated:
If you have systemd, create the service file netatopd.service in /etc/systemd/system/. It would contain:
Now you can enable the daemon:
To see live per-process network usage:
To see the top 3 network-intensive processes throughout the day:
See man atopsar for more options.
There are many options to netstat that show combinations of listening/open sockets over TCP, UDP, or both. Something like:
...would have given you a lot of output, but it would have included the source, destination, port numbers, and PID of the process owning those ports.
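As a sketch, assuming the net-tools netstat and its -tunp options (TCP, UDP, numeric addresses, PID/program name), the output can be filtered down to sockets talking to a given remote port. The helper name is hypothetical, and it assumes the usual column layout with the foreign address in column 5.

```sh
# Read "netstat -tunp" output on stdin and print the PID/program entries
# of sockets whose foreign address ends in the given port.
# Usage (as root, to see other users' processes):
#   netstat -tunp | netstat_owners_for_port 53
netstat_owners_for_port() {
    awk -v port="$1" '$5 ~ (":" port "$") { print $NF }'
}
```

Note that netstat only sees sockets that exist at the moment it runs, so a short-lived DNS lookup can easily be missed; running it repeatedly improves the odds.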
+1 for Dmitry's answer above; that worked nicely for me:
To see the resulting entries, I grep the log file for that "-k" string:
To get just the interesting fields,
(explanation: cut -d' ' -f 4- -> chop the line into fields using space (-d' ') as delimiter, show fields fourth to last ( 4- ) )
(explanation: sed "s|^|@\n|g;s| |\n|g" -> edit line, prepend '@' char-plus-newline to start of line, change spaces to newlines)
(explanation: grep -E "^((uid|comm|exe)=|@)" -> as each field of the original line is now on its own line, pick out the interesting fields: user-id, command, executable - and the line-start '@' char.)
(explanation: tr '\n@' ' \n' -> now having only the wanted fields, turn the newlines back into spaces, and the prepended '@' back into a newline (which rejoins the fields into one line))
(explanation: sort -u -> sort lines, show only unique lines)
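Put together, the stages explained above form one pipeline. The sketch below wraps them in a function that reads audit log lines on stdin; the function name audit_fields and the key MYKEY in the usage comment are placeholders, and the \n in the sed replacement assumes GNU sed.

```sh
# Reduce audit log lines to their unique uid/comm/exe combinations.
# Usage: grep MYKEY /var/log/audit/audit.log | audit_fields
audit_fields() {
    cut -d' ' -f 4- |                     # drop the first three fields
    sed 's|^|@\n|g;s| |\n|g' |            # mark record start, one field per line
    grep -E '^((uid|comm|exe)=|@)' |      # keep only the wanted fields and markers
    tr '\n@' ' \n' |                      # rejoin fields, one record per line
    sort -u                               # unique records only
}
```

Because '@' is used as the record marker, a field value that itself contains '@' would split a record in two.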
gives me:
Commands containing spaces are encoded with a simple ASCII-to-hex method (see audit_logging.c). To decode, replace "FF" with "&#xFF;" and recode that from HTML to ASCII:
(explanation: sed "s|^[^=]*=||g;s| [^ ]*=| |g" -> edit away the 'xxx=' part of each field - first, the line start (^) followed by any characters other than '=' and the '=' itself is replaced with nothing; then a space followed by any non-space characters and '=' is replaced with a space)
(explanation: while read U C E ; do ... done -> loop over each line, reading each of our three bits of data into U, C, E (user ID, command, executable))
(explanation: echo "$C" | grep -q '"' || -> test the command field to see if it contains a double quote - if not ('||') then do the following: )
(explanation: { C=\"`echo $C | sed "s|\(..\)|\&#x\1;|g" | recode h4..ascii`\" ; } -> take the command string, edit each pair of chars 'FF' to be '&#xFF;', then pass it through GNU 'recode' to turn the HTML entities into ASCII chars.)
(explanation: echo "uid=$U comm=$C exe=$E" -> print out the modified line)
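If recode is not installed, the same hex decoding can be sketched in plain POSIX shell. This swaps the HTML-entity trick for printf octal escapes; the function name decode_audit_hex is made up for illustration, and it assumes the input is an even-length hex string as found in an unquoted comm= field.

```sh
# Decode an audit hex-encoded string (e.g. a comm= value without quotes).
decode_audit_hex() {
    hex=$1
    out=
    while [ -n "$hex" ]; do
        rest=${hex#??}                    # everything after the first two chars
        [ "$rest" != "$hex" ] || return 1 # odd length: not valid hex pairs
        pair=${hex%"$rest"}               # the first two chars
        hex=$rest
        # hex pair -> three-digit octal -> the byte itself
        out="$out$(printf "\\$(printf '%03o' "0x$pair")")"
    done
    printf '%s\n' "$out"
}
```

As in the loop above, only fields without a double quote should be passed to it, since quoted values are already plain text.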
This gives me output (just showing the decoded line):
lsof would be an appropriate tool to monitor a specific port and determine the PID that's generating traffic on it. For example, here I'm monitoring the DNS/domain TCP port 53 on a server, so that I can determine which PID is causing the DNS lookup:
Now if I were to send some curl traffic to the DNS server:
we'd see this type of output from the above command:
How it works
The above command that watches for port 53 traffic works by putting lsof into a repeating loop that runs every 1 second, -r 1. We then tell lsof to report only on traffic that uses port 53, -iTCP:53. The -Pn instructs lsof to display hosts and ports as numbers, and not as names.
We then use grep to read the output coming from lsof and filter it so that we only see :53 port traffic.
The PID of the process that's sending the traffic is in the output shown by lsof as well. The 2nd column shows the PID, 4953.
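Going by the options described above, the watch command would be roughly lsof -r 1 -iTCP:53 -Pn piped into grep. As a sketch, the grep can be replaced by a small awk filter that prints each PID and command name only once; the function name is hypothetical, and it assumes the default lsof column order (COMMAND first, PID second).

```sh
# Read lsof output on stdin and print "PID COMMAND" once per process
# for lines that mention the given port.
# Usage: lsof -r 1 -iTCP:53 -Pn | lsof_pids_for_port 53
lsof_pids_for_port() {
    awk -v port="$1" '
        $2 ~ /^[0-9]+$/ && $0 ~ (":" port "([^0-9]|$)") && !seen[$2]++ {
            print $2, $1
        }'
}
```

The numeric check on the second column skips the header and the separator lines that lsof prints between repeats, and the pattern after the port number keeps :53 from also matching ports such as :5353.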