Ping a Specific Port

Question

PP.

Asked: 2013-08-03 04:35:25 +0800 CST2013-08-03 04:35:25 +0800 CST 2013-08-03 04:35:25 +0800 CST

Linux poll() with 1 fd slow after a write, fast after a read

772

Problem

I am observing that poll() calls are taking milliseconds when immediately called after a write() operation, yet taking only microseconds when immediately called after a read() operation.

This is in code that uses the ACE C++ communication library.

I'm trying to understand why there is such a significant difference in the system CPU time taken to call poll() after a write() call compared to a read() call.

System

[root@host ~]# lsb_release -a
LSB Version:    :core-4.0-amd64:core-4.0-ia32:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-ia32:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-ia32:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 5.9 (Tikanga)
Release:    5.9
Codename:   Tikanga

strace

write(234, "\4\0S\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\3\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 256) = 256 <0.000048>
poll([{fd=234, events=POLLIN}], 1, -1)  = 1 ([{fd=234, revents=POLLIN}]) <0.006036>
read(234, "\4\0S\0\0\0\0\0\1\0\0\0
write(234, "\4\0S\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\3\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 256) = 256 <0.000192>
poll([{fd=234, events=POLLIN}], 1, -1)  = 1 ([{fd=234, revents=POLLIN}]) <0.004996>
read(234, "\4\20S\0\0\0\0\0\1\0\0\0\200\2\0\0\0\0\3\0\1\0\0\0\0\10\1\0\200\2\0\0"..., 256) = 256 <0.000076>
poll([{fd=234, events=POLLIN}], 1, -1)  = 1 ([{fd=234, revents=POLLIN}]) <0.000022>
read(234, "\2\0\0\0\241\352\1\0\0\0\0\0\25\0\0\0\7\0\0\0\242\1\0\0\25\0\0\0\251\1\0\0"..., 640) = 640 <0.000337>
write(234, "\4\0S\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\3\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 256) = 256 <0.000096>
poll([{fd=234, events=POLLIN}], 1, -1)  = 1 ([{fd=234, revents=POLLIN}]) <0.004102>
read(234, "\4\20S\0\0\0\0\0\1\0\0\0!\1\0\0\0\0\3\0\1\0\0\0\0\10\0\0!\1\0\0"..., 256) = 256 <0.000037>
poll([{fd=234, events=POLLIN}], 1, -1)  = 1 ([{fd=234, revents=POLLIN}]) <0.000157>
read(234, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\t\0\0\0\10\1\0\0\0\0\0\0\0\0\0\0"..., 289) = 289 <0.000047>
write(234, "\4\0S\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\3\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 256) = 256 <0.000036>
poll([{fd=234, events=POLLIN}], 1, -1)  = 1 ([{fd=234, revents=POLLIN}]) <0.001882>

Discussion

Even though the ACE library has an option to use it there is probably little benefit in moving to epoll() because poll() is only being called with one file descriptor in this case.

As the system becomes loaded with 600 odd threads this poll() call is consuming around 98% of the system call CPU time of the process involved. So I'm looking for ways to make this more efficient. It does appear, however, that calling poll() after a write() is the issue as this is consistently in the order of 100x slower than after a read() call.

2 Answers

Voted

Soham Chakraborty · Answer 1 · 2013-08-16T09:37:20+08:00

Soham Chakraborty

2013-08-16T09:37:20+08:002013-08-16T09:37:20+08:00

The difference between the time between one syscall and another is simply the difference of time between the precedent and the next one. It is simply a measurement purpose timestamp so that users will know how much time a syscall takes after another and to find out whether the app is hanging on one particular syscall.

However I am a bit befuddled to see that reads are quicker here. Simply because writes are often async but reads tend to be in straight line and if we are hopping across different filesystem blocks, disk seek can make it slow. But see the revents (requested events - third argument in poll() call) is saying POLLIN meaning data to be read. Perhaps that might explain the apparent slowness. Albeit, I will need the entire strace to be sure.

1

PP. · Answer 2 · 2013-08-16T08:30:53+08:00

PP.

2013-08-16T08:30:53+08:002013-08-16T08:30:53+08:00

I believe I may be having issues believing what strace tells me. Looking at the source code of strace we can see that when the system call is about to be entered the current time is captured, and the current time is captured when the system call has exited.

In other words strace is measuring the wallclock time between entering and exiting the system call. Which means that it isn't measuring how much CPU time has been consumed by the system call. Blocking calls, then, can appear to consume most of the system time by the application - however this is wallclock time and has no correlation with the amount of system CPU used.

0

Linux poll() with 1 fd slow after a write, fast after a read

Problem

System

strace

Discussion

Can you pass user/pass for HTTP Basic Authentication in URL parameters?

Ping a Specific Port

Check if port is open or closed on a Linux server?

How to automate SSH login with password?

How do I tell Git for Windows where to find my private RSA key?

What's the default superuser username/password for postgres after a new install?

What port does SFTP use?

Command line to list users in a Windows Active Directory group?

What is a Pem file and how does it differ from other OpenSSL Generated Key File Formats?

How to determine if a bash variable is empty?