Some of our application owners are reporting that several processes are taking twice as long to run as they should.
This one has us scratching our heads.
We cannot figure out why some operations take twice as long on Server 1 as they do on Server 2.
Server 1: IBM x3850 M2 (RHEL 4 Nahant Update 8)
Server 1 is mostly idle from an I/O standpoint. Both servers use SAS drives in RAID 5, four drives each. iostat output from Server 1:
Linux [hostname-removed] 2.6.9-89.ELsmp #1 SMP Mon Apr 20 10:34:33 EDT 2009 i686 i686 i386 GNU/Linux
Output of /proc/cpuinfo
Output of /proc/meminfo
Server 2: IBM x3650 (RHEL 4 Nahant Update 8)
Server 2 is the more active of the two servers. The iostat output makes it look like there are a ton of devices attached because of SAN multipathing, but the dd and tar operations were done on local storage. iostat output from Server 2:
Linux [hostname-removed] 2.6.9-78.0.13.ELsmp #1 SMP Wed Jan 7 17:52:47 EST 2009 i686 i686 i386 GNU/Linux
Output of /proc/cpuinfo
Output of /proc/meminfo
As expected, writing a 1GB file is quicker on Server 1:
[server1]$ time dd if=/dev/zero of=bigfile bs=1024 count=1048576
1048576+0 records in
1048576+0 records out
real 0m15.032s
user 0m0.961s
sys 0m11.389s
Compared to Server 2, this seems to check out:
[server2]$ time dd if=/dev/zero of=bigfile bs=1024 count=1048576
1048576+0 records in
1048576+0 records out
real 0m27.519s
user 0m0.531s
sys 0m8.612s
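For what it's worth, without a sync a dd run like this mostly measures writes into the page cache. Assuming a GNU dd that supports conv=fdatasync, a variant like the following would force the data to disk before the time is reported (we have not rerun it this way yet):

[server1]$ time dd if=/dev/zero of=bigfile bs=1024 count=1048576 conv=fdatasync
[server1]$ # or wrap an explicit sync into the timed command
[server1]$ time sh -c "dd if=/dev/zero of=bigfile bs=1024 count=1048576; sync"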
However, tarballing that same file on Server 1 takes twice as long in 'user' time and a bit longer in real time.
[server1]$ time tar -czf server1.tgz bigfile
real 0m27.696s
user 0m20.977s
sys 0m5.294s
[server2]$ time tar -czf server2.tgz bigfile
real 0m23.300s
user 0m10.378s
sys 0m3.603s
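Since the gap shows up mostly in 'user' time, one way to narrow it down (a sketch we have not run yet, assuming stock GNU tar and gzip) would be to time the compression and the archiving steps separately:

[server1]$ time gzip -c bigfile > /dev/null   # compression only, CPU-bound
[server1]$ time tar -cf - bigfile > /dev/null # archiving/read path only, no compression

If the gzip-only timings show the same 2x difference, that would point at the CPU side rather than the disks.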
The performance of heavy I/O operations depends much more on disk speed and the current I/O load than on the CPU.
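If that is the suspicion here, it can be checked directly by watching per-disk utilization and I/O wait while the test runs, for example with iostat from sysstat (extended stats, 5-second intervals) and vmstat:

[server1]$ iostat -x 5
[server1]$ vmstat 5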
These are exactly the kinds of problems a tool like collectl is ideal for addressing. Measuring the time it takes for dd or tar to run is a good start, but what is happening in between? Are your I/O rates steady, or are they hitting valleys and stalls? There are all kinds of things that can go wrong from start to finish.
Since you have a system with a known 'good' performance profile, you're in the best position to actually solve this problem. Run your tests along with collectl and watch your CPU, memory, network and disks (all on the same line, making it really easy to see trends over time). You can also look at things like NFS, TCP, sockets, and several other subsystems, but I suspect they don't apply in this case.
Now repeat the test on the box known to have poor performance and see what is different. The answer WILL be there. It could be starved memory, too many interrupts on the CPU (collectl can show you this too), or large I/O wait times. Whatever it is, collectl can identify it for you, but then you have to figure out the root cause. It could be a highly fragmented or even bad disk. Maybe there's something wrong with a controller. That part is for you to figure out.
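As a rough sketch of what such a run could look like (going from memory on collectl's subsystem flags, so check your version's man page), in one terminal:

[server1]$ collectl -scdmn -oT -i 1   # cpu, disk, memory and network, one timestamped line per second

and in another terminal rerun the test:

[server1]$ time tar -czf server1.tgz bigfile

Then do the same on Server 2 and compare the two collectl streams side by side.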
Hope this helps...
-mark