We are purchasing a server to run various bioinformatics software packages. The main package on our test machine is multi-threaded and entirely CPU-bound: the CPU runs at 100% while RAM usage and disk I/O stay at minimal levels, so neither is limiting performance.
We want to ensure that we get the best processor(s) for our workload, but given the rather large list of Intel Xeons to choose from, how can we select the best one for our needs?
I appreciate that on a fundamental level "more" = "better", but how can I tell whether, for instance, a faster bus speed makes a bigger difference than a larger cache, or whether more cores matter more than a higher clock speed?
So, is there a way of profiling our software package to figure out which processor to choose? The software in question is a collection of Python scripts, so we can do the profiling on either Linux or Windows.
Have you taken a look at Cachegrind, which is part of the really excellent Valgrind instrumentation framework?
Cachegrind will at least give you an idea of how much cache thrashing is happening. You may find that your application thrashes the cache so much that it doesn't matter whether you have the larger L2/L3 cache of a Xeon or not; then again, you may find that you are bound by the CPU pipeline and that thrashing isn't occurring much at all.
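To make that concrete, here is a minimal sketch of driving Cachegrind from Python and pulling the miss-rate summary out of its output. It assumes Valgrind is installed and on your PATH, and `your_script.py` is just a stand-in for whatever entry point your pipeline actually has:

```python
import subprocess

# Cachegrind prints its I1/D1/LL miss-rate summary to stderr when the run finishes.
# Running under Valgrind is typically 20-50x slower, so feed it a representative
# but small input dataset.
result = subprocess.run(
    ["valgrind", "--tool=cachegrind", "python3", "your_script.py"],
    capture_output=True,
    text=True,
)

# Keep only the summary lines that report cache miss rates.
for line in result.stderr.splitlines():
    if "miss rate" in line:
        print(line)
```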
Cachegrind will also let you simulate arbitrary cache sizes, so you'll be able to test your code under the cache configurations of the specific processors you are considering.
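As a rough sketch of how you might compare candidates that way (Cachegrind's `--I1`, `--D1` and `--LL` options take size,associativity,line-size triples; check your Valgrind version's man page, since very old releases spelled the last one `--L2`). The cache figures below are purely illustrative, so substitute the numbers from the Xeon models you are actually weighing up:

```python
import subprocess

# Hypothetical last-level cache configurations to compare, e.g. two Xeon SKUs
# under consideration: (label, size in bytes, associativity, line size).
candidate_ll_caches = [
    ("8 MB LL cache",  8 * 1024 * 1024, 16, 64),
    ("20 MB LL cache", 20 * 1024 * 1024, 20, 64),
]

for label, size, assoc, line_size in candidate_ll_caches:
    cmd = [
        "valgrind", "--tool=cachegrind",
        f"--LL={size},{assoc},{line_size}",   # simulate this last-level cache
        "python3", "your_script.py",          # stand-in for your workload
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)

    # Report only the simulated last-level miss rates for each configuration.
    print(f"--- {label} ---")
    for out_line in result.stderr.splitlines():
        if "LL" in out_line and "miss rate" in out_line:
            print(out_line)
```

If the LL miss rate barely changes between the two configurations, extra cache probably won't buy you much and you'd be better off spending the budget on clock speed or core count instead.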