I was sorting some large files (91 GB across 27 files) with GNU sort when I noticed that iostat -dxk 3 showed very slow read speeds, between 5 MB/s and 10 MB/s, with 100% disk utilization. I tried cat large-file > /dev/null and got similar performance, only slightly higher. The same goes for cp large-file /tmp/, with /tmp on a separate disk. vim shows the same behaviour, as do the Ruby scripts I write that read files, if that helps. Write speed, on the other hand, is fine and fast.
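(For anyone wanting to reproduce the measurement: a read test that bypasses the page cache, so cached data can't skew the result, would look something like this sketch; large-file stands in for one of my files.)

    # drop cached data first so the disk, not RAM, is measured (needs root)
    sync; echo 3 > /proc/sys/vm/drop_caches

    # sequential read with direct I/O, bypassing the page cache
    dd if=large-file of=/dev/null bs=1M iflag=direct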
EDIT: It looks like these operations are only slow on files in a certain directory. The same operations on other files in a sibling directory (same disk partition) are fast, with read speeds above 90 MB/s. This makes no sense to me. Could it be due to the way these files were constructed? I created them by reading a lot of other files and writing each line into an appropriate "bucket file", depending on the first character of the line (a-z, plus a single file for everything else). So I was essentially appending lines to 27 files simultaneously, one line at a time, from 8 processes, while reading a couple thousand input files; a sketch of the idea follows below. Could this cause the blocks of each file to be laid out on disk out of sequential order, hence the slow sequential reads afterwards?
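A minimal sketch of that bucketing step, for illustration only (the real scripts were Ruby, and 8 of them ran in parallel over different input files; file names here are made up):

    # append every line of an input file to bucket-<first letter>,
    # or to bucket-other for lines not starting with a-z
    awk '{
        c = tolower(substr($0, 1, 1))
        if (c !~ /^[a-z]$/) c = "other"
        print >> ("bucket-" c)
    }' some-input-file

With several writers appending a line at a time to the same 27 files, the filesystem allocates blocks for all 27 files interleaved, so each individual file's blocks can end up scattered across the disk.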
However, I tried using fio to measure sequential read performance, and it clocked in at 73 MB/s. Also notable: my boss got proper read speeds when downloading some files via FTP from the same machine.
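I no longer have the exact job I ran, but it was a plain sequential-read job along these lines (path and size are placeholders):

    fio --name=seqread --filename=/data/testfile \
        --rw=read --bs=1M --size=4g --direct=1 --numjobs=1

Pointing --filename at one of the slow bucket files instead of a fresh test file would make the comparison with the 73 MB/s figure more direct.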
So I'm guessing this is some configuration issue somewhere, but I have no idea where. What could the reason be and how can I try to fix it?
Edit: This machine is running under Citrix Xen virtualization.
Edit: Output of iostat -dxk while sort is loading a large file into its buffer (I get similar output for cat/cp):
Device:         rrqm/s   wrqm/s      r/s     w/s     rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb              0.00     0.00  1000.00    0.00   6138.61      0.00    12.28    24.66   24.10   0.99  99.41
xvdb1             0.00     0.00  1000.00    0.00   6138.61      0.00    12.28    24.66   24.10   0.99  99.41
xvda              0.00     0.00     0.00    0.00      0.00      0.00     0.00     0.00    0.00   0.00   0.00
xvda1             0.00     0.00     0.00    0.00      0.00      0.00     0.00     0.00    0.00   0.00   0.00
Edit: Further performance degradation after a few hours (with idle stretches for the disk while sort was processing in memory). It almost looks like random IO (the arithmetic after the snapshots bears this out), but there's only a single sort operation going on, with no other processes doing any IO, so the reads should be sequential =/ :
Device:         rrqm/s   wrqm/s      r/s     w/s     rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb              0.00     0.00   638.00    0.00   2966.67      0.00     9.30    25.89   40.62   1.57 100.00

Device:         rrqm/s   wrqm/s      r/s     w/s     rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb              0.33     0.00   574.67    0.00   2613.33      0.00     9.10    27.82   47.55   1.74 100.00

Device:         rrqm/s   wrqm/s      r/s     w/s     rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb              0.00     0.00   444.33    0.00   1801.33      0.00     8.11    28.41   65.27   2.25 100.00
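A quick sanity check on these numbers (my arithmetic, not part of the iostat output): avgrq-sz is in 512-byte sectors, so the average read here is tiny.

    9.30 sectors × 512 B ≈ 4.7 KB per read;  638.00 r/s × 4.65 KB ≈ 2966 kB/s
    8.11 sectors × 512 B ≈ 4.1 KB per read;  444.33 r/s × 4.06 KB ≈ 1801 kB/s

Requests of 4-5 KB at 100% utilization with 40-65 ms waits are what seek-bound IO looks like, not streaming reads.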
Are your slow files highly fragmented? Run /usr/sbin/filefrag -v filename to find out. You'll get output like the sample below, or perhaps much worse.
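Illustrative output with made-up numbers (the exact format varies between e2fsprogs versions, but the extent count on the last line is what matters):

    $ /usr/sbin/filefrag -v large-file
    Filesystem type is: ef53
    ...
    large-file: 5780 extents found

A handful of extents reads sequentially; thousands of extents means a seek for nearly every request.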
You mention that the system is running under virtualization. Is this disk backed by a virtual disk image file?
So, I believe this is a simple case of not enough RAM...
For a start, when reading or writing a file smaller than your RAM, everything is fast, like your fio test, because it can be served largely from the page cache.
Once you start working with data larger than your OS can cache, reads can no longer be served from memory and every request has to hit the disk (in fact, you should check your RAM usage while you're seeing those slow read speeds).
It sounds like something I've experienced before: you get slow speeds when reading such a large file (I've seen a DB do exactly the same thing when it was using large indexes that didn't fit into RAM).
Given also the overhead of working on a VM, this sounds very typical to me.
I would check that your disks aren't swapping and that your active memory isn't full (and let me know) :D
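For example, with standard tools (the commands are my suggestion, not from the original post):

    free -m         # overall RAM use; "cached" is the page cache
    vmstat 1 5      # nonzero si/so columns mean the box is actively swapping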