It has been a long time since I last saw someone post their system's technical specs in such detail that it left me nothing further to ask about the system. You did leave out the HDD, though, which also plays an essential role in system performance, especially for I/O. Never mind, it's all good for now.
Coming to your problem: you mention that you have dual quad-core Xeons and that the application fully utilizes four cores when its affinity is limited to them. When the process is allowed to use all the available cores, the load is distributed evenly across all eight, so each core drops from 100% to 50%, yet the application still behaves as if there were no CPU capacity left. Right? We also know that you have a good graphics card, so graphics handling, where the GPU shares the load with the CPU, should not be the issue. Good!
Can't seem to isolate the problem? Let me ask you: have you ever looked into how many cores an OS or an application is actually able to utilize? If you have, you will know that despite the increasing number of cores on a die, not every OS or application has been updated to make use of all of them. Getting my point?
You see, the Windows XP-Pro SP2 x86 that you use initially could not utilize both cores of a dual-core processor, but Microsoft later overcame that problem by releasing a hot-fix for SP2 on its website. They then updated XP-Pro with the release of SP3, and they may have provided a hot-fix enabling it to utilize all four cores of a quad-core processor. I am not aware of one yet, as quad-core desktop processors are not that old.
First, you should have verified whether XP-Pro SP2 is certified to work with the E5420 in its THOL (Tested Hardware and Operating System List), which unfortunately it is NOT. Windows XP-Pro SP2 x86 is not certified to work on S5000PSL boards. Since you are using that motherboard with a compatible E5420 quad-core processor, you can run XP on it, but you will not get the performance you should get out of such powerful server-class hardware. XP-Pro is designed for desktop and laptop computers, not for server-class hardware, so there goes problem # 1.
Problem # 2: just as with your OS (assuming XP-Pro SP2 is utilizing all 8 cores), you should check how many cores your application is able to utilize. Write to the application's manufacturer and ask whether the application is capable of running on a quad-core processor, or of utilizing 8 cores in total.
In your specific scenario the OS could be one problem, but judging by the load distribution you describe, I sense that the main one is your application, which does not seem capable of utilizing more than 4 cores.
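For anyone wondering why an even 50% across eight cores points at a four-core limit, here is the arithmetic as a small Python sketch. The function name and the assumption that the total amount of work stays fixed while the scheduler spreads it evenly are mine, purely for illustration:

```python
def average_core_load(busy_cores, total_cores, load_per_busy_core=100.0):
    """Average per-core load (%) when a fixed amount of work is spread evenly.

    Assumption (illustrative only): the application produces the same total
    amount of work no matter how many cores it is allowed to run on.
    """
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return busy_cores * load_per_busy_core / total_cores


# Four cores' worth of work pinned to four cores: each core is saturated.
print(average_core_load(4, 4))
# The same work spread over all eight cores: each core shows half the load.
print(average_core_load(4, 8))
```

So a flat 50% on every core does not by itself mean spare capacity is being used; it is what four cores' worth of work looks like once it is spread over eight.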
I hope you get my point and you find it useful.
XP-Pro SP2 Hot Fix for Dual Core: http://www.amdzone.com/index.php/news/windows/3928
OS Certification List (S5000PSL): http://www.intel.com/support/motherboards/server/sb/CS-022651.htm
Intel Go Green, Save The Environment!
Hey Jevad,
Running the system with Win Server 2003 Standard R2 SP2 (plus all available patches) helped: the CPU usage limit for the application has now increased to ~90%. However, the overall performance of the system is poorer than expected. I think some further optimizations will soon get us closer to being happy with the system.
It's good to know that processor utilization for the application has increased to 90%, provided it is running efficiently and delivering the results you want. However, if the overall performance of the system has degraded, are all the cores being used at 90%?
You need to check RAM utilization as well. You said you have 2GB of RAM, so it is possible that the application is using most of it, which would affect the overall performance of the system. In that case we will have to turn to the swap option, if you have not created a swap drive before. We can do further system tuning to isolate the performance degradation. Also, can you check and confirm the RPM of your HDD? I/O on a slower HDD can also cause slow data fetching, which affects overall system performance.
Update me on that, and let us catch the nasty part and tune the system to deliver the performance it should.
Yes, all eight CPU cores are running at 90% load. You are right that RAM or HDD utilization could be the bottleneck of the system, but the recording application is not memory-intensive: it consumes 400-600 MB, and in the current phase of the test no data is written to the HDD. There is only frame capturing and analog-to-digital data conversion in the background.
It seems that the Adjacent Cache Line Prefetch function has a negative effect on the application. Could you please give me detailed information about it?
Well, cache line prefetch could be a problem, unless we are overlooking memory latency, but first: have you confirmed that it is enabled in the BIOS? If so, let me explain in a bit more detail what it is and how it works with dual- and multi-core processors. Since you mentioned Adjacent Cache Line Prefetch, I assume you already know about it, but for the benefit of other community members who may find this discussion useful in future, here is a brief overview of ACLP.
ACLP (Adjacent Cache Line Prefetch function):
The processor has a hardware adjacent cache line prefetch mechanism that automatically fetches an extra 64-byte cache line whenever the processor requests a 64-byte cache line. This reduces cache latency by making the next cache line immediately available if the processor requires it as well.
When enabled, the processor will retrieve the currently requested cache line, as well as the subsequent cache line.
When disabled, the processor will only retrieve the currently requested cache line.
In a desktop system, enabling this feature improves performance as there's a high probability of the processor requiring the next cache line as well as the currently requested cache line. It is therefore recommended that you enable this BIOS feature in a desktop system.
But in a server, the probability of the next cache line being required by the processor is lower than that of a desktop system. The higher cache miss ratio inevitably leads to higher bus utilization, which reduces the processor's performance.
You will need to evaluate the performance effect of this feature on your server and determine if it should be disabled or enabled for better performance. But servers should generally disable this feature.
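To make the desktop-versus-server trade-off above a bit more concrete, here is a toy Python model. It is only a sketch under loud assumptions: the cache is unbounded, "adjacent" simply means the next 64-byte line as described above, and the function and variable names are made up for illustration. It is not a model of how the Xeon hardware actually behaves.

```python
LINE_SIZE = 64  # bytes per cache line, as described above


def simulate(addresses, prefetch_adjacent):
    """Return (demand_misses, lines_transferred) for a toy, unbounded cache.

    Assumption (illustrative only): on a miss with prefetch enabled, the next
    cache line is also brought in, unless it is already cached.
    """
    cached = set()
    misses = 0
    transferred = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        if line in cached:
            continue  # hit: nothing crosses the bus
        misses += 1
        cached.add(line)
        transferred += 1
        if prefetch_adjacent and (line + 1) not in cached:
            cached.add(line + 1)  # speculative fetch of the next line
            transferred += 1
    return misses, transferred


sequential = [i * 64 for i in range(8)]   # touches lines 0, 1, 2, ... 7
scattered = [i * 256 for i in range(8)]   # touches lines 0, 4, 8, ... 28

for name, pattern in (("sequential", sequential), ("scattered", scattered)):
    for enabled in (False, True):
        print(name, "prefetch on" if enabled else "prefetch off",
              simulate(pattern, enabled))
```

In this model the sequential pattern takes half as many misses with prefetch on for the same bus traffic, while the scattered pattern takes just as many misses and moves twice as many lines over the bus. That is the gist of why the setting has to be evaluated against the real access pattern of the application.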
Now, let's say you have ENABLED ACLP in the BIOS and your processor is multi-core and supports SMP. The effect shows up whenever the processor reads data from memory in one of the test-simulated situations, forward or backward, which corresponds to encoding video or some other data stream processing, as in your case. You have multiple cores with independent cache lines, and with prefetch enabled the processor should be able to fetch more data ahead of time and have it available for use the moment it is done with the data currently being processed.
In conjunction with this, with ACLP enabled you should be able to achieve better performance, provided there are no memory latency issues. Take a look at the image below for a graph of memory latency and ACLP performance with ACLP enabled on an Intel Pentium 4 Prescott 3.4GHz.
For detailed information on ACLP, read the attached document, which runs tests to explore two methods for measuring memory latency on a 3.4GHz Intel Pentium 4 Prescott series processor and also discusses your scenario. Hope this helps; otherwise you are more than welcome to contact us for any further details or issues.