My server has 24 CPU cores and 96 GB of memory, running CentOS 7.2 x86_64.
After starting my program with a large data set, it uses about 50 GB of memory, and the system shows a high rate of interrupts while the context-switch rate stays low. dstat shows somewhere between 500k and 1000k int/s. CPU usage is close to 100%: about 40% us, 60% sy.
If the data set is small, the program uses about 5 GB of memory and everything is fine: CPU usage is 100%, about 99% us, 1% sy, which is what I expect.
The program is my own multi-threaded program. It does no network I/O and very little disk I/O; it is mostly memory operations and arithmetic. The thread model and the algorithm are the same regardless of data set size.
My question is: how can I find out exactly which interrupts my program triggers the most (and, if possible, get rid of them to improve performance)?
I assume you don't have a single-socket system with 24 cores; it is probably a NUMA system with 2x12 cores. In that case I'd suggest making sure the program runs on only one NUMA node (usually one socket) and uses that node's local half of the RAM.
When the program uses 50 GB, NUMA locality can't be assured, because that is more than half of the total memory.
To check the actual state, use numastat. If you're on RHEL, you can use numad to handle memory locality automatically. Or run
numactl --hardware
to get an overview of your hardware's NUMA nodes. There is quite a nice howto with examples: http://fibrevillage.com/sysadmin/534-numactl-installation-and-examples
That way you can pin your program to the desired CPUs and memory node.
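As a sketch of that pinning (node 0, ./myprog, and <pid> are all placeholders for your actual node, binary, and process id):

```shell
# Bind both execution and memory allocations to NUMA node 0.
# ./myprog is a placeholder for your own program.
numactl --cpunodebind=0 --membind=0 ./myprog

# While it runs, check the per-node memory usage of the process
# (replace <pid> with the real process id):
numastat -p <pid>
```

If numastat shows most pages on the bound node, locality is working; if pages are split across nodes, the working set doesn't fit in one node's local memory.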
I'd also suggest checking whether the irqbalance daemon is running; otherwise a single core may end up overloaded handling all the interrupts.
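A quick way to check (pgrep-based; on a systemd distro like CentOS 7 you could equally use systemctl status irqbalance):

```shell
# Report whether the irqbalance daemon is currently running.
pgrep -x irqbalance >/dev/null && echo "irqbalance running" || echo "irqbalance not running"
```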
On Linux:
watch cat /proc/interrupts
will show the number of interrupts per interrupt source and per CPU. I'd guess that in your case you'll mostly see LOC (local timer) and RES (rescheduling) climbing.
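Rather than eyeballing the raw counters, you can diff two snapshots to see which sources fire fastest. A small bash sketch (the awk helper that sums the per-CPU columns is my own, not a standard tool):

```shell
#!/bin/bash
# Sum the per-CPU counters on each line of /proc/interrupts,
# snapshot twice one second apart, and print the interrupt
# sources with the largest increase, highest first.
snap() {
    awk 'NR > 1 { s = 0; for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) s += $i; print $1, s }' /proc/interrupts
}
a=$(snap)
sleep 1
b=$(snap)
paste <(echo "$a") <(echo "$b") | awk '{ print $1, $4 - $2 }' | sort -k2 -rn | head
```

The first column of the output is the interrupt name (e.g. LOC:, RES:) and the second is interrupts per second summed over all CPUs, so the top line is your biggest interrupt consumer.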