    Performance loss after migrating code to a different HPC cluster




      I'm migrating in-house software from a cluster with Intel Xeon X5550 processors to a cluster with Intel Xeon E5-2650 v2 processors, and I'm seeing a loss of performance. Judging by the processor specifications, the E5-2650 v2 should perform better, and it also supports the AVX instruction set extensions. Instead, runs are up to 50% slower.


      Both clusters have the same compiler, and I'm using the same compilation flags, except for the -mavx option, which I added for the new cluster:


      -O3 -shared-intel -free -align all -mavx -xHost -opt-mem-bandwidth2 -finline-functions -inline all -no-inline-min-size -fp-model fast=2 -unroll -unroll-aggressive -warn nointerfaces -nogen-interfaces -fpp -lstdc++ -align array64byte -ipo


      I tried compiling with -O2 instead of -O3, but that made the code slower on both clusters. I also tried removing the -mavx and -xHost flags, but there was no observable difference.

      Can anyone help me understand whether I'm doing something wrong here?

      Thank you very much,