<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:clearspace="http://www.jivesoftware.com/xmlns/clearspace/rss" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>The Server Room Blog</title>
    <link>http://communities.intel.com/community/openportit/server/blog</link>
    <description>Server Room</description>
    <pubDate>Wed, 08 Jul 2009 22:54:49 GMT</pubDate>
    <generator>Clearspace 2.5.9 (http://jivesoftware.com/products/clearspace/)</generator>
    <dc:date>2009-07-08T22:54:49Z</dc:date>
    <item>
      <title>Coding Scalable or Parallel Applications – It Is Not As Hard As You Think</title>
      <link>http://communities.intel.com/community/openportit/server/blog/2009/07/08/coding-scalable-or-parallel-applications-it-is-not-as-hard-you-think</link>
      <description>&lt;!-- [DocumentBodyStart:38eb1ece-3d36-4821-89f0-cbe00770f322] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p class="MsoNormal" style="margin: 0in 0in 6pt;"&gt;&lt;span style="color: black; font-size: 12pt;"&gt;&lt;span style="font-family: Calibri;"&gt;Writing scalable applications has been important to programmers in the HPC community for years. With the proliferation of multi-core and many-core processors, developing scalable software is now a top priority for many programmers.  &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0in 0in 6pt;"&gt;&lt;span style="color: black; font-size: 12pt;"&gt;&lt;span style="font-family: Calibri;"&gt;Andrew S. Tanenbaum stated at the USENIX ’08 conference last year that “sequential programming is really hard” … the &lt;em&gt;difficulty&lt;/em&gt; is that “parallel programming is a step beyond that.”  &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="color: black; font-size: 12pt;"&gt;&lt;span style="font-family: Calibri;"&gt;He is right, but let’s illustrate why it is just a small step.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="color: black; font-size: 12pt;"&gt;&lt;span style="font-family: Calibri;"&gt;Here is the point: parallel architectures will continue to proliferate, and we will need to develop and refine parallel algorithms that exploit them. While developing and refining parallel algorithms is difficult, the actual programming of these new algorithms does not need to be hard.  
However, if the developer is required to know the intimate details of the hardware, then developing and refining parallel algorithms can be very difficult and very time consuming.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="font-family: Calibri;"&gt;One approach provided by Intel software developer tools is to abstract away the details of the hardware.  This allows developers to focus on their algorithms and applications, and rely on Intel software developer tools to provide the best optimizations for current and future platforms.&lt;span style="color: black;"&gt; While you may give up some performance by working at a higher level of abstraction, that loss is offset by your ability to iterate through more of your parallelization ideas in less time.  You may find yourself designing and developing better approaches to parallelism because you were able to test more hypotheses.  &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="color: black; font-size: 12pt;"&gt;&lt;span style="font-family: Calibri;"&gt;An additional by-product of not having to know the intricacies of the hardware is that your software will be highly adaptable to future platforms.  You will see tremendous improvements on multi-core solutions and will be in a great position to scale your application performance forward as newer architectures are made available.  
&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin: 0in 0in 0pt;"&gt;&lt;span style="font-family: Calibri;"&gt;&lt;span style="color: black; font-size: 12pt;"&gt;To learn more about Intel Software Tools and the benefits of optimizing your software on multi-core based solutions, first visit &lt;/span&gt;&lt;span style="font-size: 12pt;"&gt;&lt;a class="jive-link-external-small" href="http://software.intel.com/en-us/intel-sdp-home/"&gt;http://software.intel.com/en-us/intel-sdp-home/&lt;/a&gt;&lt;span style="color: black;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:38eb1ece-3d36-4821-89f0-cbe00770f322] --&gt;</description>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_tuning</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">software_tools</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">optimization</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">tuning</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_leadership</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_benchmark</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">benchmark</category>
      <pubDate>Wed, 08 Jul 2009 22:54:49 GMT</pubDate>
      <author>wesley.e.shimanek@intel.com</author>
      <guid>http://communities.intel.com/community/openportit/server/blog/2009/07/08/coding-scalable-or-parallel-applications-it-is-not-as-hard-you-think</guid>
      <dc:date>2009-07-08T22:54:49Z</dc:date>
      <clearspace:dateToText>4 months, 3 weeks ago</clearspace:dateToText>
      <clearspace:replyCount>1</clearspace:replyCount>
      <wfw:comment>http://communities.intel.com/community/openportit/server/blog/comment/coding-scalable-or-parallel-applications-it-is-not-as-hard-you-think</wfw:comment>
      <wfw:commentRss>http://communities.intel.com/community/openportit/server/blog/feeds/comments?blogPost=12332</wfw:commentRss>
    </item>
    <item>
      <title>Considerations when tuning your Intel Xeon Processor 5500 series based server</title>
      <link>http://communities.intel.com/community/openportit/server/blog/2009/03/30/considerations-when-tuning-your-intel-xeon-processor-5500-series-based-server</link>
      <description>&lt;!-- [DocumentBodyStart:d12604f2-ac1e-4253-8ab5-f46bf339a0d8] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;h1 style="MARGIN: 24pt 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="color: #365f91;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h1&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;This blog post is meant to discuss some of the considerations for performance tuning your Intel® Xeon® Processor 5500 (“Nehalem-EP”) series based server. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt; I’d like to do this by discussing the un-boxing process.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;Step 1. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Place the box on the floor&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;Step 2. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Open the box&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;Step 3. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Carefully remove the server.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;Step 4. 
&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Plug the server into a keyboard, mouse, and monitor.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;Step 5. Plug the server into the wall socket.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;Step 6. Power on the server.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;There. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;You are done tuning your Nehalem-EP based server for performance. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;“Really?” you ask? &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Well mostly. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;There are some considerations and I’ll discuss them. 
&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;I can speak to this subject as I was asked to tune this class of system using the&lt;/span&gt;&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.tpc.org/tpcc/default.asp"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;TPC-C&lt;/span&gt;&lt;/span&gt;&lt;/a&gt; &lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;and&lt;/span&gt;&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.tpc.org/tpce/default.asp"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;TPC-E&lt;/span&gt;&lt;/span&gt;&lt;/a&gt; &lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;benchmarks.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;h2 style="MARGIN: 10pt 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 14pt;"&gt;&lt;span style="color: #4f81bd;"&gt;BIOS / Firmware / Drivers&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;&lt;p style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;It is very important to update your system's BIOS, firmware, and OS drivers before you do any deep performance tuning. I cannot overstate the importance of this step. Your system's manufacturer should be able to provide the latest BIOS and firmware associated with your server. OS drivers are available through many sources these days. 
Typically these can be downloaded from OS vendors, hardware vendors, from the Linux open source community, or the platform's manufacturer.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;A good example of this is the SATA driver associated with the ICH10. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;The ICH10 is part of the chipset that supports Nehalem-EP. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;I recommend going to Intel’s website and using the Intel Matrix Storage Manager driver for the SATA controller.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;h2 style="MARGIN: 10pt 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 14pt;"&gt;&lt;span style="color: #4f81bd;"&gt;Understand your system&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;Last year, Nehalem launched for the desktop market segment. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Now it is time for the server market. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;The Nehalem-EP processor is meant to be used in dual processor (DP) socket systems. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Nehalem-EP is the follow on to the Intel Xeon Processor 5400 (“Harpertown”) series. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;However, Nehalem-EP is really very different from Harpertown. 
The Nehalem-EP processor is based on the&lt;/span&gt;&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.intel.com/products/processor/corei7/index.htm"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;Intel Core i7&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;The Nehalem-EP processor inherits the same architectural features as the Intel Core i7. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Once you understand these features, then you can better tune your system for&lt;/span&gt;&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.intel.com/performance/server/index.htm?iid=perf+sv_all_img"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;performance&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;h3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #4f81bd;"&gt;L3 Cache:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;Nehalem-EP uses a level 3 cache. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Depending on which SKU you are using it can be 4MB or 8MB in size. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;If you are interested in performance, then I would encourage you to pick the larger cache size SKU. 
&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;h3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #4f81bd;"&gt;Hyper-Threading Technology:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;If some threads are good, then more threads are better. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;This is where&lt;/span&gt;&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.intel.com/technology/platform-technology/hyper-threading/"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;Hyper-Threading Technology&lt;/span&gt;&lt;/span&gt;&lt;/a&gt; &lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;comes into play. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Nehalem-EP provides this technology out of the box. 
&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;So, on a typical DP server this will give your system 16 threads of processing goodness (2 sockets × 4 cores × 2 threads per core).&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;h3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #4f81bd;"&gt;Intel QuickPath Interconnect:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;Nehalem-EP supports a CPU interconnect known as Intel&lt;/span&gt;&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.intel.com/technology/quickpath/introduction.pdf"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;QuickPath Interconnect (QPI)&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;This interconnect is the replacement for the Front Side Bus of old. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;QPI provides point-to-point links between the processors and the Intel 5520 chipset. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;The Nehalem-EP supports QPI speeds of up to 6.4 GT/s. This provides a theoretical bandwidth of 25.6 GB/s per link (12.8 GB/s in each direction). 
&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;This is a welcome shift for Intel’s designs for the future.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;h3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #4f81bd;"&gt;Turbo Boost Technology:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;As with the desktop SKUs of Nehalem, the Nehalem-EP supports&lt;/span&gt;&lt;/span&gt; &lt;a class="jive-link-external-small" href="http://www.intel.com/technology/turboboost/index.htm?iid=tech_arch_nextgen+body_turboboost"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;Turbo Boost technology&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;This technology will run the CPU at a higher frequency than its rating. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;It will increase the frequency in steps of 133MHz until it achieves its upper thermal and electrical design requirements. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Turbo Boost Technology is dynamic. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;In other words, the processor will decrease its core frequency if the temperature is too high. 
&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;If your application is sensitive to core frequency changes and does not fully utilize all cores, then it may benefit from this technology.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;h3 style="MARGIN: 10pt 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #4f81bd;"&gt;Integrated Memory Controller:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;Another key feature of Nehalem-based processors is that the memory controller is integrated into the processor. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;This allows for much lower memory latencies. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;The Nehalem-EP supports three channels of DDR3 memory. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;It is important to talk about DDR3 memory population on Nehalem-EP based servers. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;As mentioned before, Nehalem-EP supports three channels of memory and supports 800, 1067, and 1333 MT/s memory speeds. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Those speeds depend on how many DIMMs are populated per channel. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;For instance, 1333 MT/s is supported only with a single DIMM per channel. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;1067 MT/s is supported with one or two DIMMs per channel. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;800 MT/s is supported in all configurations. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;These speeds are based on dual-ranked DIMMs. 
&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;If you plan on filling up all the memory slots with as many DIMMs as possible, you will end up running at 800 MT/s. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;So, here is the consideration. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Does your application need all that memory, or could it use less memory running at a higher speed? &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;If the answer is yes to the latter, then perhaps running two DIMMs per channel at 1067 MT/s is the best configuration.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;To wrap things up here, we have looked at the new Nehalem architecture, the importance of BIOS, firmware, and OS drivers, and memory population. &lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;Your application's performance will vary, but I hope I have given you some pointers to help narrow down your performance testing. Thanks for taking the time to read this blog post. 
&lt;span style="mso-spacerun: yes"&gt;&lt;/span&gt;For more great performance methodology tips, please check out Shannon Cepeda’s&lt;/span&gt;&lt;/span&gt; &lt;a class="jive-link-blog-small" href="http://communities.intel.com/community/openportit/server/blog/2007/12/20/10-habits-of-great-server-performance-tuners"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;blog&lt;/span&gt;&lt;/span&gt;&lt;/a&gt; &lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt;"&gt;&lt;span style="color: #000000;"&gt;posts on performance tuning.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="MARGIN: 0in 0in 0pt"&gt;&lt;span style="font-family: 'Arial','sans-serif';"&gt;&lt;span style="font-size: 12pt; color: #000000;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:d12604f2-ac1e-4253-8ab5-f46bf339a0d8] --&gt;</description>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">xeon_5500</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">nehalem</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_benchmark</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_tuning</category>
      <pubDate>Mon, 30 Mar 2009 21:42:26 GMT</pubDate>
      <author>dan.krogh@intel.com</author>
      <guid>http://communities.intel.com/community/openportit/server/blog/2009/03/30/considerations-when-tuning-your-intel-xeon-processor-5500-series-based-server</guid>
      <dc:date>2009-03-30T21:42:26Z</dc:date>
      <clearspace:dateToText>8 months, 3 days ago</clearspace:dateToText>
      <clearspace:replyCount>3</clearspace:replyCount>
      <wfw:comment>http://communities.intel.com/community/openportit/server/blog/comment/considerations-when-tuning-your-intel-xeon-processor-5500-series-based-server</wfw:comment>
      <wfw:commentRss>http://communities.intel.com/community/openportit/server/blog/feeds/comments?blogPost=11997</wfw:commentRss>
    </item>
    <item>
      <title>Things to consider when tuning your MP Xeon 7400 series server</title>
      <link>http://communities.intel.com/community/openportit/server/blog/2008/09/29/things-to-consider-when-tuning-your-mp-xeon-7400-series-server</link>
      <description>&lt;!-- [DocumentBodyStart:1e711b30-3da4-40b0-8a1d-0100d51110fc] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;The following are some considerations prior to tuning your MP Xeon 7400 series server. I can speak to this subject as I was asked to tune this system using the &lt;a class="jive-link-external-small" href="http://www.tpc.org/tpcc/default.asp"&gt;TPC-C&lt;/a&gt; and &lt;a class="jive-link-external-small" href="http://www.tpc.org/tpce/default.asp"&gt;TPC-E&lt;/a&gt; benchmarks for internal measurements at Intel. While you may not be setting up thousands of hard disk spindles for your performance work, this blog post attempts to capture some of the key tuning considerations of this Xeon-based server.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;h2&gt;&lt;span&gt;Understand your system&lt;/span&gt;&lt;/h2&gt;&lt;p&gt;The key to tuning any system, whether it is a Formula One race car (I promise to stay away from silly car performance analogies) or a server, is to understand it. Identify which components have an effect on performance and which don't. This will narrow down your tuning efforts. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;h3&gt;&lt;span&gt;Architecture&lt;/span&gt;&lt;/h3&gt;&lt;p&gt;Like all of Intel's platforms, an MP Xeon 7400 series server is made of several ingredients. Of course, since I work for Intel, I need to start the ingredient list with the central processor. Our website has a good description of this processor &lt;a class="jive-link-external-small" href="http://download.intel.com/products/processor/xeon/7400_prodbrief.pdf"&gt;here&lt;/a&gt;. The MP Xeon 7400 processor is made up of three Core 2 Duo T5000/T7000 series processors. This provides six (yes six) cores for processing goodness. 
Each of the Core 2 Duo T5000/T7000 series processors provides two 32KB level 1 caches (one for data and one for code) and a 3MB level 2 unified cache. In addition to these two levels of cache, the MP Xeon 7400 processor provides a 16MB level 3 unified cache. The other major ingredient of this platform is the &lt;a class="jive-link-external-small" href="http://www.intel.com/products/server/chipsets/7300/7300-overview.htm"&gt;Intel® 7300 Chipset&lt;/a&gt;. This chipset provides four independent front side bus links to the four CPU sockets. In addition, this chipset provides a snoop filter and four channels of &lt;a class="jive-link-external-small" href="http://en.wikipedia.org/wiki/FB-DIMM"&gt;FBD&lt;/a&gt; memory.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;h4&gt;&lt;span&gt;If some is good, then more is better:&lt;/span&gt;&lt;/h4&gt;&lt;p&gt;The key thing to take away here is that an MP Xeon 7400 system fully populated with top bin processors will provide a whopping 24 cores of processing power in a four socket system. This is great for the enterprise benchmarks I use for performance testing, as those applications are multithreaded and designed for multi-core processors. The same may not be true for your application, so please keep that in mind. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Another thing to remember is that an MP Xeon 7400 processor's design follows a growing pattern in the Xeon processor family. Specifically, I am referring to the addition of the level 3 cache (L3). This is also known as the last level cache (LLC). 
This follows the design of the &lt;a class="jive-link-external-small" href="http://download.intel.com/products/processor/xeon/procbrief.pdf"&gt;Potomac&lt;/a&gt; (Xeon MP 64-bit) and &lt;a class="jive-link-external-small" href="http://www3.intel.com/cd/channel/reseller/asmo-na/eng/products/server/processors/7000/feature/index.htm"&gt;Tulsa&lt;/a&gt; (7100-series) processors. The value of the large LLC is that it reduces the number of cache misses that would require the machine to go to FBD memory for the latest copy of a cache line. This additional level of on-chip cache comes at a price, though: higher latency. While the latency penalty is relatively low compared to the latency to memory, it is important to mention it here. Again, the LLC greatly benefits the enterprise benchmarks I use for performance testing, as they have a large memory footprint. The same may not be true for your application. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;h2&gt;&lt;span&gt;BIOS / Firmware / Drivers&lt;/span&gt;&lt;/h2&gt;&lt;p&gt;It is very important to update your system's BIOS, firmware, and OS drivers before you do any deep performance tuning. I cannot overstate the importance of this step. Your system's manufacturer should be able to provide the latest BIOS and firmware associated with your server. OS drivers are available through many sources these days. Typically these can be downloaded from OS vendors, hardware vendors, from the Linux open source community, or the platform's manufacturer.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;h2&gt;&lt;span&gt;Prefetchers&lt;/span&gt;&lt;/h2&gt;&lt;p&gt;Intel processors have traditionally provided four prefetchers. These are accessible via the model-specific register IA32_MISC_ENABLE and sometimes via your OEM's BIOS. 
These features are meant to help the processor load data in a predictive manner to keep the cache hierarchy filled with the most pertinent cache lines. This is great if the application uses data in a somewhat predictable way. If your application uses cache lines in a random fashion, then the prefetchers may negatively impact performance. My best advice for you is to test your application with the prefetchers enabled and disabled. Table B-3 (MSR 0x1A0) in &lt;a class="jive-link-external-small" href="http://download.intel.com/design/processor/manuals/253669.pdf"&gt;this link&lt;/a&gt; covers the prefetchers I am referring to.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;h2&gt;&lt;span&gt;Memory Population&lt;/span&gt;&lt;/h2&gt;&lt;p&gt;As mentioned before, an MP Xeon 7400 series server will provide four channels of FBD memory. There are a couple of considerations here. First, latency to memory increases for every DIMM added to the system. This is important to note because you can keep the memory latency to a minimum by adding fewer high capacity DIMMs. Second, be sure to evenly distribute the DIMMs across all the channels. In other words, don't fill up all the slots on one channel and then lightly populate the rest. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;h2&gt;&lt;span&gt;An External Factor that may affect performance&lt;/span&gt;&lt;/h2&gt;&lt;p&gt;Like many Intel designs, an MP Xeon 7400 series server will choose dishonor over death. I am referring to how it deals with high temperatures. The FBD memory inside an MP Xeon 7400 series server makes use of a thermal monitor on each DIMM. If the memory becomes too hot the chipset will begin to throttle memory bandwidth in an effort to reduce the temperature of the system. This will have a drastic negative impact to performance. So, keep your server room nice and cool. 
&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;To wrap things up, we have looked at the architecture, the importance of BIOS, firmware, and OS drivers, the prefetchers, memory population, and the effects of high temperatures. Your application's performance will vary, but I hope I have given you some things to help narrow down your testing. So, by now you might be asking, "Where do I start?" Well, not to be too self-serving, but I would check out more of our blog posts here. A great place to start for performance methodologies would be Shannon Cepeda's &lt;a class="jive-link-external-small" href="http://communities.intel.com/openport/blogs/server/2007/12/20/10-habits-of-great-server-performance-tuners"&gt;blog&lt;/a&gt;. This series is a great resource for anyone interested in computer performance methodologies.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:1e711b30-3da4-40b0-8a1d-0100d51110fc] --&gt;</description>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">dunnington</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_tuning</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_benchmark</category>
      <pubDate>Tue, 30 Sep 2008 00:26:24 GMT</pubDate>
      <author>dan.krogh@intel.com</author>
      <guid>http://communities.intel.com/community/openportit/server/blog/2008/09/29/things-to-consider-when-tuning-your-mp-xeon-7400-series-server</guid>
      <dc:date>2008-09-30T00:26:24Z</dc:date>
      <clearspace:dateToText>1 year, 2 months ago</clearspace:dateToText>
      <wfw:comment>http://communities.intel.com/community/openportit/server/blog/comment/things-to-consider-when-tuning-your-mp-xeon-7400-series-server</wfw:comment>
      <wfw:commentRss>http://communities.intel.com/community/openportit/server/blog/feeds/comments?blogPost=11594</wfw:commentRss>
    </item>
    <item>
      <title>Server Performance Tuning Habit #2: Start at the Top</title>
      <link>http://communities.intel.com/community/openportit/server/blog/2008/01/25/server-performance-tuning-habit-2-start-at-the-top</link>
      <description>&lt;!-- [DocumentBodyStart:99616c12-d313-425f-a052-57b075452b8e] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;Here's the second follow-up post in my &lt;a class="jive-link-blog-small" href="http://communities.intel.com/community/openportit/server/blog/2007/12/20/10-habits-of-great-server-performance-tuners"&gt;10 Habits of Great Server Performance Tuners&lt;/a&gt; series. This one focuses on the second habit: Start at the top. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Let me start by relating a true (although simplified) story. My team at Intel has built up years of expertise running a particular benchmark. So when the time came to start running a new, similar benchmark, we thought: "No problem." We began running tests while the benchmark was still in development. Immediately we had an issue: the type of problem that would normally indicate our hardware environment wasn't set up properly. We checked everything that we had seen cause the issue in the past, and we couldn't find anything. So, we blamed the new benchmark. After all, we were experts and we had been setting up these environments for years! We knew what we were doing. You can probably guess where this story is going: after weeks of doing things to work around the "benchmark issue", we figured out that we &lt;em&gt;had&lt;/em&gt; misconfigured the environment, resulting in a bottleneck on one part of our testbed. We hadn't thoroughly tested that part of the environment because it had never caused us problems with the old benchmark. And of course, on the new benchmark it was critical. We had broken one of the most important rules of performance tuning: Start at the Top. 
&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So now you know how easy it can be to &lt;em&gt;not&lt;/em&gt; Start at the Top. Even seasoned performance engineers can get overconfident and forget this rule. But the consequences can be dire:&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;ul&gt;&lt;li level="1" type="ul"&gt;&lt;p&gt;1. You have to eat major crow when you realize your mistake. I'm just now getting over the humiliation.&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ul"&gt;&lt;p&gt;2. You might have put tunings in place to address issues that weren't really there. This is at best wasted work and at worst something that you have to painstakingly undo when you fix the &lt;em&gt;real&lt;/em&gt; issue.&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So...how do you avoid this situation? Simple: use the Top-Down Performance Tuning process. This means you start by tuning your hardware. Then you move to the application/workload, then to the micro-architecture (if possible). What you are looking for at each level are &lt;a class="jive-link-external-small" href="http://en.wikipedia.org/wiki/Bottleneck_(engineering)"&gt;bottlenecks&lt;/a&gt;: situations where one component of the environment or workload is limiting the performance of the whole system. Your goal is to find any system-level bottlenecks before you move down to the next level. For example, you may find that your network bandwidth is bottlenecked and you need to add another NIC to your server. Or you may find that you need to add another drive to your RAID array, or that your CPU load is being distributed unevenly. 
Any bottleneck involving your server system hardware (processors, memory, network, HBAs, etc.), attached clients, or attached storage is a system-level bottleneck. Find these by using system-level tools (which I will touch on in a future post for Habit #8), remove them, then proceed to the application/workload level and repeat the process. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Being vigilant about using the top-down process will ensure you don't waste time tuning a non-representative system. And it just may save you some embarrassment! &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;a href="http://communities.intel.com/servlet/JiveServlet/showImage/38-10859-1225/IMG_2506-measureBottleneck-edit2-x250.jpg"&gt;&lt;img height="317" src="http://communities.intel.com/servlet/JiveServlet/downloadImage/38-10859-1225/250-317/IMG_2506-measureBottleneck-edit2-x250.jpg" width="250"/&gt;&lt;/a&gt; &lt;/p&gt;&lt;p&gt;&lt;span style="color:#0000ff"&gt;Always measure your bottlenecks!&lt;/span&gt; &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Keep watching The Server Room for information on the other 8 habits in the coming weeks. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:99616c12-d313-425f-a052-57b075452b8e] --&gt;</description>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_tuning</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_benchmark</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">bottleneck</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">shannon_cepeda</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">cepeda</category>
      <pubDate>Sat, 26 Jan 2008 00:18:38 GMT</pubDate>
      <author>shannon.g.cepeda@intel.com</author>
      <guid>http://communities.intel.com/community/openportit/server/blog/2008/01/25/server-performance-tuning-habit-2-start-at-the-top</guid>
      <dc:date>2008-01-26T00:18:38Z</dc:date>
      <clearspace:dateToText>1 year, 10 months ago</clearspace:dateToText>
      <wfw:comment>http://communities.intel.com/community/openportit/server/blog/comment/server-performance-tuning-habit-2-start-at-the-top</wfw:comment>
      <wfw:commentRss>http://communities.intel.com/community/openportit/server/blog/feeds/comments?blogPost=10859</wfw:commentRss>
    </item>
    <item>
      <title>Performance Benchmarking 101</title>
      <link>http://communities.intel.com/community/openportit/server/blog/2007/11/26/performance-benchmarking-101</link>
      <description>&lt;!-- [DocumentBodyStart:ce83c503-1dfb-415a-8330-85fd74bacaa3] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;&lt;strong&gt;Take a look at the chart below ... it's telling you something... isn't it?&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;It's more than performance numbers and marketing, &lt;u&gt;it's data&lt;/u&gt;... &lt;strong&gt;REAL&lt;/strong&gt; data! &lt;/p&gt;&lt;p&gt;But what does it mean - and ultimately - how can &lt;u&gt;you&lt;/u&gt; relate to it? &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;a href="http://www.intel.com/performance/server/i/xeon_ppw1.jpg"&gt;&lt;img src="http://www.intel.com/performance/server/i/xeon_ppw1.jpg"/&gt;&lt;/a&gt; &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;If you're really into high-powered computing, you're probably quite familiar with common benchmark data. With every new CPU release, there are tons of new statistics, models, and ways to test the increased performance of the newer technology - in this case, the 45nm-based CPUs launched just this month. But what exactly does all this data amount to? Reading benchmarks is more than just seeing a bar chart - there's a science to digging into the data... &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;First, let's take a step back for those of you who may not fully understand what benchmarking is for. Benchmarks provide a common ground for comparing the performance of various systems across different CPU/system architectures. A common set of instructions (or programs) is set up to run under regulated guidelines to ensure the testing is performed equally across the competing platforms or architectures. It is very much like sports: if you have two different runners, they run the same course - e.g. the 100-yard dash. This creates the comparative benchmark. 
&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So let's get back to the latest hot stuff - the &lt;a class="jive-link-external-small" href="http://www.intel.com/products/processor/xeon5000/index.htm?iid=servproc+body_xeon5000subtitle"&gt;Intel Xeon 5400 Series&lt;/a&gt; and &lt;a class="jive-link-external-small" href="http://www.intel.com/products/processor/core2XE/index.htm?iid=homepage+c2e"&gt;Core 2 Extreme QX9650&lt;/a&gt; quad-core processors. In the past 18 months, computing models have taken a giant leap forward by adding more CPU cores per socket, thereby increasing the thread density of your platform. In dual-socket systems, where you used to have two threads, you now have four or even eight! And in quad-socket systems the count can go up to 16! You're increasing your capacity to process data by a factor of 3 or 4 depending on the platform. This has made a &lt;a class="jive-link-external-small" href="http://download.intel.com/technology/quad-core/server/quadcore-ghz-myth.pdf"&gt;tremendous change&lt;/a&gt; in how benchmarks have to be set up and run, and we have to evaluate the testing methods to ensure we're maximizing the computability of each platform. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;There are a few key steps to take before you consider benchmarking your system:&lt;/strong&gt; &lt;/p&gt;&lt;ol&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;identify your problem area (processing power, network bandwidth, memory utilization, etc.)&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;identify your competing products&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;evaluate the 'leaders' in your problem area&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;survey for available benchmarking tools&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;evaluate 'best practices' for testing (e.g. 
processors with lower idle power won't really help much if you're only doing high-end computing)&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;and then - implement your findings in your chosen architecture(s)&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;In the high-end server space you usually see more vendor-specific data than end-user testing, primarily because of the finite set of data that server administrators are looking for. Many of these 'industry standard' benchmarks are monitored for efficiency and assure the end user that the testing was properly performed and the results are repeatable: &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Industry Standard Benchmarks&lt;/strong&gt; &lt;/p&gt;&lt;ul&gt;&lt;li level="1" type="ul"&gt;&lt;p&gt;&lt;a class="jive-link-external-small" href="http://en.wikipedia.org/wiki/Standard_Performance_Evaluation_Corporation"&gt;Standard Performance Evaluation Corporation&lt;/a&gt; (SPEC)&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ul"&gt;&lt;p&gt;&lt;a class="jive-link-external-small" href="http://en.wikipedia.org/wiki/Transaction_Processing_Performance_Council"&gt;Transaction Processing Performance Council&lt;/a&gt; (TPC)&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ul"&gt;&lt;p&gt;&lt;a class="jive-link-external-small" href="http://en.wikipedia.org/wiki/BAPCo_consortium"&gt;BAPCo&lt;/a&gt;, an industry consortium developing benchmarks for Windows personal computers&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ul"&gt;&lt;p&gt;&lt;a class="jive-link-external-small" href="http://www.synchromeshcomputing.com/servicesBenchmarking.html"&gt;Synchromesh Computing benchmark 
tests&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ul"&gt;&lt;p&gt;&lt;a class="jive-link-external-small" href="http://www.eembc.org/"&gt;The Embedded Microprocessor Benchmark Consortium (EEMBC)&lt;/a&gt;&lt;br/&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Intel uses many of these standards for benchmarking - as you can see on the Xeon 5000 Series processor &lt;a class="jive-link-external-small" href="http://www.intel.com/performance/server/xeon/intthru.htm"&gt;Benchmark Page&lt;/a&gt;. &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Even if you're a server admin, you most likely deal with client machines and their day-to-day performance as well. If you search the web for &lt;a class="jive-link-external-small" href="http://www.google.com/search?svnum=10&amp;amp;um=1&amp;amp;hl=en&amp;amp;rls=com.microsoft:en-us&amp;amp;q=cpu+benchmarks&amp;amp;ie=UTF-8&amp;amp;sa=N&amp;amp;tab=iw"&gt;CPU benchmarks&lt;/a&gt;, the most commonly viewed benchmarks are performed on the client side of computing, mainly because of a few factors: &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;ol&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;clients are usually cheaper and more abundant to test with&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;visuals in client computing are usually more fun to watch than seeing SQL data fly across the screen (hey - just being honest here!)&lt;/p&gt;&lt;/li&gt;&lt;li level="1" type="ol"&gt;&lt;p&gt;and servers in general are built for more specific purposes, whether it's application, storage, modeling or other specialties&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Many of you have probably heard of benchmark sites such as: &lt;a class="jive-link-external-small" 
href="http://www.anadtech.com/"&gt;Anandtech&lt;/a&gt;, &lt;a class="jive-link-external-small" href="http://www.tomshardware.com/"&gt;Toms Hardware&lt;/a&gt;, &lt;a class="jive-link-external-small" href="http://www.firingsquad.com/"&gt;FiringSquad&lt;/a&gt;, &lt;a class="jive-link-external-small" href="http://hardocp.com/"&gt;HardOCP&lt;/a&gt; and many others (respond with your favorites please!)  Each of these sites use common tools/applications to benchmark the latest and greatest hardware against each other.  Depending on what you're looking to do with your hardware really determines what/how you want to benchmark your system (or look for data reviews for your configuration).  After all, a &lt;a class="jive-link-external-small" href="http://www.firingsquad.com/hardware/intel_skulltrail_preview/"&gt;machine that can run the latest games at over 60 frames per second&lt;/a&gt; may not be the best SQL server for your datacenter - right? &lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;If you're looking for quick 'brute force' computational tools to try your hand at CPU benchmarking, try something simple like &lt;a class="jive-link-external-small" href="http://boinc.berkeley.edu/"&gt;BOINC&lt;/a&gt;, &lt;a class="jive-link-external-small" href="http://www.xtremesystems.com/pi/"&gt;Super PI&lt;/a&gt;, or you can get more elaborate by using some methods as &lt;a class="jive-link-external-small" href="http://reviews.cnet.com/Labs/4520-6603_7-5020816-1.html"&gt;described by C-Net&lt;/a&gt; by using &lt;a class="jive-link-external-small" href="http://www.maxon.net/pages/download/cinebench_e.html"&gt;Cinebench&lt;/a&gt;, or &lt;a class="jive-link-external-small" href="http://www.sisoftware.net/"&gt;SiSoftware Sandra&lt;/a&gt;. Once you've figured out some of the basics - and can repeat these simpler tests - you can jump into those Industry Standards and get into some serious work! 
&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So in closing, there are many variables to account for when looking to validate the performance of a given system. Processor speeds, I/O subsystem configuration, memory latencies, network bandwidth, power utilization, etc... the permutations are nearly endless. So you have to be diligent in first identifying your key problem(s), and then attack them by benchmarking with the best known methods. Also, when reading benchmark information &lt;strong&gt;BE SURE&lt;/strong&gt; to read the configurations of the systems in question - are they truly comparable? Are the components running at spec level or overclocked? Are the speed differences negligible, or substantial in real-world evaluation? And finally, focus on what's important to you and your computing requirements - after all, you need to be sure you've picked the correct system for your needs.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:ce83c503-1dfb-415a-8330-85fd74bacaa3] --&gt;</description>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance_benchmark</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">todd_christ</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">45nm</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">performance</category>
      <category domain="http://communities.intel.com/community/openportit/server/blog/tags">benchmark</category>
      <pubDate>Tue, 27 Nov 2007 00:15:00 GMT</pubDate>
      <author>todd.christ@intel.com</author>
      <guid>http://communities.intel.com/community/openportit/server/blog/2007/11/26/performance-benchmarking-101</guid>
      <dc:date>2007-11-27T00:15:00Z</dc:date>
      <clearspace:dateToText>2 years, 3 days ago</clearspace:dateToText>
      <clearspace:replyCount>2</clearspace:replyCount>
      <wfw:comment>http://communities.intel.com/community/openportit/server/blog/comment/performance-benchmarking-101</wfw:comment>
      <wfw:commentRss>http://communities.intel.com/community/openportit/server/blog/feeds/comments?blogPost=10773</wfw:commentRss>
    </item>
  </channel>
</rss>

