Ping a Specific Port

Question

Kyle Brandt

Asked: 2010-07-06 07:43:38 +0800 CST2010-07-06 07:43:38 +0800 CST 2010-07-06 07:43:38 +0800 CST

Is there a way to get Cache Hit/Miss ratios for block devices in Linux?

772

Is it possible to see in Linux how many read and write requests from user space end up causing cache hits and misses for block devices?

5 Answers

Voted

famzah · Answer 1 · 2010-07-06T22:44:54+08:00

You can develop your own SystemTap script. You need to account the following two subsystems:

VFS: this represents all I/O requests before the Buffer cache (i.e. absolutely every I/O request); review the "vfs.read", "vfs.write" and the "kernel.function("vfs_*")" probes; you need to filter out the block devices you want to monitor by their respective major+minor numbers.
Block: this represents all I/O requests sent to the block devices before the I/O scheduler (which also does merge + reorder of the I/O requests); here we know which requests were missed by the Buffer cache; review the "ioblock.request" probe.

SystemTap development takes some time to learn. If you are a moderate developer and have good knowledge in Linux, you should be done in 3-4 days. Yes, it takes time to learn, but you'll be very happy with the results - SystemTap gives you the opportunity to (safely) put probes in almost any place in the Linux kernel.

Note that your kernel must have support for loading and unloading of kernel modules. Most stock kernels nowadays support this. You'll also need to install the debug symbols for your kernel. For my Ubuntu system, this was as easy as downloading a several hundred MB .deb file, which the Ubuntu kernel development team compiled for me. This is explained at the SystemtapOnUbuntu Wiki page, for example.

P.S. Take the SystemTap approach only if you have no other solution, because it's a totally new framework which you have to learn, and that costs time/money and sometimes frustration.

Dave Wright · Answer 2 · 2011-04-29T06:39:45+08:00

I went ahead and wrote an stap script for this. There is one on the systemtap wiki, but it doesn't appear to be correct. In basic testing, this appears pretty accurate but YMMV.

#! /usr/bin/env stap
global total_bytes, disk_bytes, counter

probe vfs.read.return {
  if (bytes_read>0) {
    if (devname=="N/A") {
    } else {
      total_bytes += bytes_read
    }
  }
}
probe ioblock.request
{
    if (rw == 0 && size > 0)
    {
        if (devname=="N/A") { 
        } else {
          disk_bytes += size
        }
    }

}

# print VFS hits and misses every 5 second, plus the hit rate in %
probe timer.s(5) {
    if (counter%15 == 0) {
        printf ("\n%18s %18s %10s %10s\n", 
            "Cache Reads (KB)", "Disk Reads (KB)", "Miss Rate", "Hit Rate")
    }
    cache_bytes = total_bytes - disk_bytes
    if (cache_bytes < 0)
      cache_bytes = 0
    counter++
    hitrate =  10000 * cache_bytes / (cache_bytes+disk_bytes)
    missrate = 10000 * disk_bytes / (cache_bytes+disk_bytes)
    printf ("%18d %18d %6d.%02d%% %6d.%02d%%\n",
        cache_bytes/1024, disk_bytes/1024,
        missrate/100, missrate%100, hitrate/100, hitrate%100)
    total_bytes = 0
    disk_bytes = 0
}

BMDan · Answer 3 · 2010-07-06T08:52:21+08:00

BMDan

2010-07-06T08:52:21+08:002010-07-06T08:52:21+08:00

/proc/slabinfo is a good start, but doesn't give you quite the information you're looking for (don't be fooled by the hit/miss percentages on systems with multiple cores and stats enabled; those are something else). As far as I know, there's not a way to pull that particular information out of the kernel, though it shouldn't be terribly difficult to write up a bit of code to do.

Edit: http://www.kernel.org/doc/man-pages/online/pages/man5/slabinfo.5.html

2

lemonsqueeze · Answer 4 · 2016-04-25T04:18:44+08:00

Now there's the cachestat utility from perf-tools package.

The author also lists some (possibly cruder) alternatives people use:

A) Study the page cache miss rate by using iostat(1) to monitor disk reads, and assume these are cache misses, and not, for example, O_DIRECT. The miss rate is usually a more important metric than the ratio anyway, since misses are proportional to application pain. Also use free(1) to see the cache sizes.

B) Drop the page cache (echo 1 > /proc/sys/vm/drop_caches), and measure how much performance gets worse! I love the use of a negative experiment, but this is of course a painful way to shed some light on cache usage.

C) Use sar(1) and study minor and major faults. I don't think this works (eg, regular I/O).

D) Use the cache-hit-rate.stp SystemTap script, which is number two in an Internet search for Linux page cache hit ratio. It instruments cache access high in the stack, in the VFS interface, so that reads to any file system or storage device can be seen. Cache misses are measured via their disk I/O. This also misses some workload types (some are mentioned in "Lessons" on that page), and calls ratios "rates".

shodanshok · Answer 5 · 2016-04-25T07:45:40+08:00

shodanshok

2016-04-25T07:45:40+08:002016-04-25T07:45:40+08:00

If you are interested in the IO hit/miss ratio of a specific process, a simple but very effective approach is to read the /proc/<pid>/io file.

Here you will find 4 key values:

rchar: the total number of read bytes from the application point of view (ie: with no difference made between read satisfied from the physical storage rather than from the cache)
wchar: as above, but about written bytes
read_bytes: the bytes really read from the storage subsystem
write_bytes: the bytes really written to the storage subsystem

Say a process has the following values:

rchar: 1000000
read_bytes: 200000

The read cache miss ratio (in bytes) is 100*200000/1000000 = 20%, and the hit ratio is 100-20 = 80%

There is a catch, however: the rchar value includes thing as tty IO, so for processes that read/write a lot from/to a pipe the calculation above will be skewed, reporting higher hit ratio than the effective one.

2

Is there a way to get Cache Hit/Miss ratios for block devices in Linux?

Ping a Specific Port

How do I tell Git for Windows where to find my private RSA key?

How do you restart php-fpm?

What's the default superuser username/password for postgres after a new install?

What port does SFTP use?

Resolve host name from IP address

How can I sort du -h output by size

Command line to list users in a Windows Active Directory group?

What is a Pem file and how does it differ from other OpenSSL Generated Key File Formats?

How to determine if a bash variable is empty?