12 Replies Latest reply on Dec 10, 2017 9:11 PM by Anju_Paul

    Not getting performance acceleration on Nervana Dev Cloud

    srivatsa96

      I recently got access to the Intel Nervana Dev Cloud as part of the Student Ambassador program. I tried running my TensorFlow code for a neural captioning system, but I am not getting a speedup in training compared to my local computer. I am not sure whether TensorFlow is using the underlying MKL library. Code that took 12 minutes on Google Cloud with an 8-core virtual CPU (Intel Ivy Bridge) took 5 hours on the cluster. I am probably making a mistake but cannot figure out what it is. Please help.

       

      Regards

      Srivatsa

      Intel Student Ambassador

        • 1. Re: Not getting performance acceleration on Nervana Dev Cloud
          HemanthK

          Hi Srivatsa,

          Please let us know which version of TensorFlow you are using.

          On DevCloud, TensorFlow (v1.4) with MKL is already built into the Intel Distribution for Python installed on the cluster.

          The easiest way to access TensorFlow (v1.4) is to add the following lines to the '~/.bash_profile' script:

          PATH=/glob/intel-python/python2/bin/:/glob/development-tools/gcc/bin:$PATH:$HOME/bin

          LD_LIBRARY_PATH=/glob/development-tools/gcc/bin/lib64:$LD_LIBRARY_PATH

          You will have to either log out and back in, or run the following, for the changes to take effect:

               [u100@c009 ~]# source ~/.bash_profile
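
To confirm that the PATH change took effect, it can help to print which interpreter and which TensorFlow build are actually picked up. The following is only a sketch: the function name show_tf_install is made up for illustration, and it assumes the interpreter is invoked as python.

```shell
# Report which python is first on PATH and which TensorFlow it imports.
# show_tf_install is a hypothetical helper name, not a DevCloud tool.
show_tf_install() {
  if py=$(command -v python); then
    echo "python: $py"
    # Import noise on stderr is suppressed; only the summary line is printed.
    "$py" -c 'import tensorflow as tf; print("tensorflow: " + tf.__version__ + " from " + tf.__file__)' 2>/dev/null \
      || echo "tensorflow: not importable from this interpreter"
  else
    echo "python: not found on PATH"
  fi
}

show_tf_install
```

If the reported interpreter is not under /glob/intel-python/, the profile changes are probably not active in the current shell.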

           

          Best

          Hemanth

          • 2. Re: Not getting performance acceleration on Nervana Dev Cloud
            srivatsa96

            Hi Hemanth,

             

            I checked the version; it's 1.2.1, not 1.4. Do I need to update it explicitly?

            I also get the following warnings when I run my code:

            [u6717@c009 ImageCaptioningModel]$ qsub run.sh

            26499.c009

            [u6717@c009 ImageCaptioningModel]$ qpeek -e -f 26499

            2017-11-19 10:04:40.012650: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

            2017-11-19 10:04:40.012691: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.

            2017-11-19 10:04:40.012696: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.

            2017-11-19 10:04:40.012700: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.

            2017-11-19 10:04:40.012703: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX512F instructions, but these are available on your machine and could speed up CPU computations.

            2017-11-19 10:04:40.012706: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.

             

            I believe it is not able to use the AVX512F instruction set to optimise performance on Intel Xeon Scalable processors.
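
To see which of the instruction sets named in these warnings a node actually reports, one rough check is to read the CPU flags. This sketch assumes a Linux node that exposes /proc/cpuinfo; the function name report_cpu_flags is illustrative only. It should be run inside a submitted job to inspect the compute node rather than the login node.

```shell
# Report whether each instruction set from the warnings appears in a
# space-separated list of CPU flags (hypothetical helper name).
report_cpu_flags() {
  for flag in sse4_1 sse4_2 avx avx2 avx512f fma; do
    case " $1 " in
      *" $flag "*) echo "$flag: available" ;;
      *)           echo "$flag: not reported" ;;
    esac
  done
}

# Assumption: the first "flags" line of /proc/cpuinfo describes the node's CPUs.
report_cpu_flags "$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)"
```

This only shows what the hardware offers; whether the installed TensorFlow build uses those instructions is a separate question.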

            • 3. Re: Not getting performance acceleration on Nervana Dev Cloud
              srivatsa96

              And yes, I added those lines. I followed the instructions from the Colfax initial login page.

              • 4. Re: Not getting performance acceleration on Nervana Dev Cloud
                HemanthK

                Hi,

                The AVX512 warnings are pretty normal according to our TensorFlow optimization engineers. Just in case, we'll double-check and get back to you.

                Meanwhile please continue to use this version.

                 

                Thanks

                Hemanth

                • 5. Re: Not getting performance acceleration on Nervana Dev Cloud
                  srivatsa96

                  Hi,

                  One more thing I found while trying things out: the path /glob/development-tools/gcc/bin/lib64 does not exist, but /glob/development-tools/gcc/lib64 does. I am not sure whether this is relevant; I just wanted to let you know.
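
A quick way to spot entries like this is to walk a colon-separated path variable and report which directories exist. This is a minimal sketch; check_path_entries is a made-up helper name.

```shell
# Print each entry of a colon-separated path list and whether it is an
# existing directory. Runs in a subshell so IFS and globbing changes
# do not leak into the calling shell.
check_path_entries() (
  IFS=:
  set -f
  for dir in $1; do
    if [ -d "$dir" ]; then
      echo "ok: $dir"
    else
      echo "missing: $dir"
    fi
  done
)

check_path_entries "${LD_LIBRARY_PATH:-}"
```

The same function can be pointed at "$PATH" to check the other line from the profile.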

                   

                  Regards

                  Srivatsa

                  • 6. Re: Not getting performance acceleration on Nervana Dev Cloud
                    Anju_Paul

                    Hi,

                     

                    Just to check if it is a version issue, could you please try creating a new conda environment with tensorflow.

                     

                    Steps are given below:

                     

                    conda create -n tensorflowenv python=2  # or python=3, to match your code

                    source activate tensorflowenv

                    conda install -c intel tensorflow

                     

                    The above steps use conda to install a newer version of TensorFlow (1.3.1).

                    You may need to include the environment activation (source activate tensorflowenv) in the script you submit to the queue.
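
As a rough illustration of activating the environment inside the submitted script, the sketch below writes a job script named run.sh (the name used with qsub earlier in this thread). Everything else is an assumption: train.py is a placeholder for the training script, and PBS_O_WORKDIR is assumed to be set by a PBS-style scheduler.

```shell
# Write a hypothetical job script that activates the environment first.
cat > run.sh <<'EOF'
#!/bin/bash
# Activate the conda environment inside the job, since the job's shell
# may not inherit the environment of the login session.
source activate tensorflowenv

# Assumption: a PBS-style scheduler sets PBS_O_WORKDIR to the directory
# qsub was called from; otherwise stay in the current directory.
cd "${PBS_O_WORKDIR:-.}"

# train.py is a placeholder name for your training script.
python train.py
EOF

# Then submit it as before:
# qsub run.sh
```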

                     

                    Regards,

                    Anju

                    • 7. Re: Not getting performance acceleration on Nervana Dev Cloud
                      srivatsa96

                      Hi Anju

                       

                      I tried creating the virtual environment, but I always get the following error:

                       

                      CondaError: Cannot link a source that does not exist. /glob/intel-python/versions/2018/intelpython2/pkgs/setuptools-27.2.0-py27_intel_0/lib/python2.7/site-packages/setuptools/lib2to3_ex.pyc

                       

                      I tried setting appropriate environment variables but it was futile.

                       

                      Regards

                      Srivatsa

                      • 8. Re: Not getting performance acceleration on Nervana Dev Cloud
                        Anju_Paul

                        Hi,

                         

                        We will check on this and give you an update soon.

                         

                        Meanwhile, could you please check if the below mentioned steps work for you:

                         

                        conda config --get channels # This will give you all the available channels.

                        conda config --remove channels intel # Run this command only if you find the intel channel is listed for the above command

                        conda create -n tensorflowenv python=2  # or python=3, to match your code

                        source activate tensorflowenv

                        conda config --add channels intel

                        conda install tensorflow

                         

                        Regards,

                        Anju

                        • 9. Re: Not getting performance acceleration on Nervana Dev Cloud
                          srivatsa96

                          Hi,

                          The command conda config --get channels doesn't list anything.

                          • 10. Re: Not getting performance acceleration on Nervana Dev Cloud
                            Anju_Paul

                            Hi,

                             

                            Shall I assume that the command did not throw an error, but also did not list anything?

                            If that is the case, please go ahead and try the last part of the commands if you are using Python 2:

                             

                            conda create -n tensorflowenv python=2

                            source activate tensorflowenv

                            conda config --add channels intel

                            conda install tensorflow

                             

                            If you are using Python 3, then the following commands should work:

                             

                            conda config --add channels intel

                            conda create -n tensorflowenv python=3

                            source activate tensorflowenv

                            conda install tensorflow

                             

                             

                            Regards,

                            Anju

                            • 11. Re: Not getting performance acceleration on Nervana Dev Cloud
                              srivatsa96

                              Hello,

                               

                              I tried this, but there was no improvement.

                              Some stats:
                              It's 1.19 iterations/second on my local machine (2nd Gen i3, 4 GB RAM) and 1.39 iterations/second on the cluster.
                              It takes 1 minute 49 seconds on my local machine and 1 minute 32 seconds on the cluster.
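
As a quick check of the arithmetic, both measurements put the cluster at under 1.2 times the speed of the local machine. The helper name speedup below is illustrative only.

```shell
# Ratio of two measurements, rounded to two decimals (hypothetical helper).
speedup() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

speedup 1.39 1.19   # iterations/second, cluster vs. local: prints 1.17
speedup 109 92      # seconds per run, local (1m49s) vs. cluster (1m32s): prints 1.18
```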

                               

                              My Code
                              GitHub - srivatsa96/image-captioning: Image Captioning Model based on Show and Tell model using VGG16 FC7 features (Tran…

                              • 12. Re: Not getting performance acceleration on Nervana Dev Cloud
                                Anju_Paul

                                Hi,

                                 

                                This thread is now being taken offline for a detailed analysis of the code.

                                We will provide updates once the problem is resolved.

                                 

                                Regards,

                                Anju