3 Replies Latest reply on Apr 28, 2016 1:34 PM by Intel Corporation

    AVX Base vs. AVX Turbo


      The document here (http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e5-v3-spec-update.pdf) states that a Xeon E5-2699 v3 (Haswell-EP) chip has an AVX Base Frequency of 1.9 GHz and an "AVX Turbo Boost Technology Maximum Core Frequency" (called AVX Turbo in the remainder of my post) of 2.6 GHz when using all 18 cores.


      How do I compute the peak DP GFlop/s for this chip? Assuming 1.9 GHz, I arrive at 1.9 [GHz] * 18 [cores] * 2 [two FMA units] * 4 [AVX DP lanes] * 2 [flops per FMA] = 547.2 GFlop/s. Assuming 2.6 GHz, I get 748.8 GFlop/s. Some documents say the AVX Turbo frequency can only be reached for "most AVX" workloads, without defining what that means. What factors determine whether my code counts as one of these "most AVX" workloads?
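      For reference, the arithmetic above can be written as a small script. This is only a sketch of the formula used in this post; the function name `peak_dp_gflops` and its parameter names are made up for illustration, and the defaults (two FMA units, four double-precision lanes per 256-bit AVX register, two flops per FMA) are the assumptions stated in the calculation above.

```python
def peak_dp_gflops(freq_ghz, cores, fma_units=2, simd_lanes=4, flops_per_fma=2):
    """Theoretical peak double-precision GFlop/s.

    Assumes every core retires `fma_units` AVX FMA instructions per cycle,
    each operating on `simd_lanes` doubles and counting as `flops_per_fma`
    floating-point operations per lane (one multiply plus one add).
    """
    return freq_ghz * cores * fma_units * simd_lanes * flops_per_fma


# Values quoted in this post for the Xeon E5-2699 v3 with all 18 cores active.
base = peak_dp_gflops(1.9, 18)   # AVX Base Frequency
turbo = peak_dp_gflops(2.6, 18)  # AVX Turbo frequency

print(f"AVX base:   {base:.1f} GFlop/s")
print(f"AVX turbo:  {turbo:.1f} GFlop/s")
print(f"turbo/base: {turbo / base:.3f}")
```

      The ratio printed on the last line is simply 2.6 / 1.9, since every other factor cancels, so the gap between the two peak figures comes entirely from the clock frequency.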


      I tested a chip with a dot product benchmark that issues two AVX loads and two AVX FMAs per cycle, with the dataset in the L3 cache, running on all cores. According to the RAPL counters, it went over the TDP, yet the measured frequency was still 2.6 GHz. IMO a workload can't get much worse than that for the chip. Does it perhaps depend on the quality of the individual chip whether my code will run at something below 2.6 GHz? In that case, some people would get a chip that offers roughly 37% more performance (2.6 GHz vs. 1.9 GHz) than one that can only sustain AVX base. That doesn't sound right. Can someone please clarify?