I'm currently testing the limits of a server application on a system with two 12-core processors, i.e. 24 cores in total (hyper-threading is currently disabled).
I'm running 12 UDP receive-heavy instances of that same app on that box, and according to Task Manager they max out all 12 cores of one of the two sockets, while the other socket stays idle. I also tried starting a 13th instance of the application, but it still runs on the first, already busy socket.
Why? What are the rules by which threads get scheduled to cores in a multi-processor system?
Some ideas:
- Could it have to do with RSS (Receive Side Scaling), which is enabled and would be involved in selecting cores at least for the receiver thread of our app (though the app has other work-heavy threads, too)?
- Is RSS (or MSI/MSI-X) limited to one socket?
- Could/will a single app instance run on multiple sockets at the same time?
System details:
- ProLiant BL460c Gen9
- Intel Xeon E5-2670 v3
- Windows Server 2012 R2
Some slots / I/O devices communicate directly with one of the processors (other slots may communicate with the other one). There are advantages to doing the work generated by a card or I/O device on the processor it is attached to.
For example, Intel DDIO (Data Direct I/O) works best on the processor attached to the PCI slot(s) / I/O device.
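
If you want to experiment with placing some instances on the other socket, a rough sketch of how per-socket affinity masks could be computed is below. It assumes that logical processors are numbered contiguously per socket (0-11 on the first, 12-23 on the second) and that all 24 sit in a single processor group; neither is confirmed for this box, so check the real layout first. The function names `socket_affinity_mask` and `assign_instances` are made up for illustration.

```python
def socket_affinity_mask(socket_index: int, cores_per_socket: int) -> int:
    """Return a bitmask with one bit set per logical processor on a socket.

    ASSUMPTION: processors are numbered contiguously per socket, so
    socket 0 owns bits 0..N-1, socket 1 owns bits N..2N-1, and so on.
    """
    if socket_index < 0 or cores_per_socket <= 0:
        raise ValueError("socket_index must be >= 0 and cores_per_socket > 0")
    per_socket = (1 << cores_per_socket) - 1
    return per_socket << (socket_index * cores_per_socket)


def assign_instances(instance_count: int, sockets: int, cores_per_socket: int):
    """Spread instances round-robin over sockets.

    Returns a list of (socket_index, affinity_mask) tuples, one per instance.
    """
    return [
        (i % sockets, socket_affinity_mask(i % sockets, cores_per_socket))
        for i in range(instance_count)
    ]


if __name__ == "__main__":
    # Two sockets with 12 cores each, as on the machine in the question.
    for socket in range(2):
        print(f"socket {socket}: 0x{socket_affinity_mask(socket, 12):06X}")
```

A mask like this could then be passed as the hex value to `start /AFFINITY` in `cmd`, or applied with the Win32 `SetProcessAffinityMask` call. Keep in mind that moving the receive path away from the socket the NIC is attached to may cost you the locality benefits described above, so it is worth measuring both placements.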