A lot has been written in the HPC field about how it doesn't matter how many cores you have in a processor if you can't keep them busy. Well, I will pile on with my own proof point and opinion. One of the key factors in keeping the processor cores busy is how fast you can move data in and out of them. In some cases, customers maximize total application performance by optimizing for performance per core. This often also gets you the best performance return on your application licensing costs (if you happen to be licensing your application on a per-core, per-process, or software-token basis). The per-core optimization works by turning off some cores in each processor, which gives the remaining active cores more memory bandwidth, and spreading the workload across more processor sockets or servers while keeping the total number of cores constant. So why would you turn off cores in one of the best processors on the market today?
Consider the example of a Nehalem-EP processor running a memory-bandwidth-intensive energy application (refer to the figure). The base case (relative elapsed time = 1.0) corresponds to running the application on 2 dual-socket servers with all cores active (4 cores per processor). Moving to the right, running on 4 dual-socket servers with only 2 of the 4 cores active per processor, which keeps the total number of cores constant, resulted in roughly a 30% improvement in application elapsed time. In other words, moving from 2 two-socket servers with all cores active to 4 two-socket servers with half the cores active per processor bought about a 30% improvement in application performance. If your software licensing costs are on a per-core or per-process basis, then you just used the same number of software licenses while getting results about 30% faster. Results will vary with your application, but a number of applications in the Energy and CAE fields exhibit this type of scaling behavior.
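The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration, not a model of the benchmark: the `Config` class and its field names are made up for this post, the even split of a socket's memory bandwidth among its active cores is a simplifying assumption (real per-core bandwidth does not scale this cleanly), and the 0.7 relative elapsed time is just the "roughly 30%" figure from above, not a measured value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical description of a cluster configuration (names are illustrative)."""
    servers: int
    sockets_per_server: int
    active_cores_per_socket: int

    @property
    def total_cores(self) -> int:
        # Active cores are what a per-core license would be charged against.
        return self.servers * self.sockets_per_server * self.active_cores_per_socket

    @property
    def bandwidth_share_per_core(self) -> float:
        # ASSUMPTION: each socket's memory bandwidth is split evenly among
        # its active cores. Expressed as a fraction of one socket's bandwidth.
        return 1.0 / self.active_cores_per_socket


def work_per_licensed_core(cfg: Config, relative_elapsed_time: float) -> float:
    """Jobs completed per unit time, per licensed core, in relative units."""
    return 1.0 / (relative_elapsed_time * cfg.total_cores)


# Base case: 2 dual-socket servers, all 4 cores active per processor.
base = Config(servers=2, sockets_per_server=2, active_cores_per_socket=4)
# Spread case: 4 dual-socket servers, 2 of 4 cores active per processor.
spread = Config(servers=4, sockets_per_server=2, active_cores_per_socket=2)

# Relative elapsed times: 1.0 is the baseline; 0.7 approximates the
# "roughly 30% improvement" quoted in the text (an approximation, not data).
base_time, spread_time = 1.0, 0.7

bandwidth_gain = spread.bandwidth_share_per_core / base.bandwidth_share_per_core
license_gain = (work_per_licensed_core(spread, spread_time)
                / work_per_licensed_core(base, base_time))

print(f"licensed cores: base={base.total_cores}, spread={spread.total_cores}")
print(f"bandwidth share per active core grows by {bandwidth_gain:.1f}x")
print(f"work per licensed core grows by about {license_gain:.2f}x")
```

The point of the sketch is that the license count stays the same in both configurations while each active core gets a larger share of its socket's memory bandwidth. The hardware count doubles, of course, so whether the trade pays off depends on how your server costs compare with your per-core licensing costs.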
With upcoming multi-core products on the horizon, and not just from Intel, it is important to keep in mind what HPC folks have known all along: it is about the optimal balance of performance. I would add one more thing: extracting the maximum performance out of your software licensing costs. With the Intel® Xeon® processor 5500 series (Nehalem-EP) on the market today and the upcoming Westmere-EP processor, we aim to give HPC users just that.