11 Replies Latest reply on Dec 13, 2017 12:03 AM by HemanthK

    The correct way to launch hybrid mpi/openmp tasks

    RobeDM

      Hello,

       

      I have implemented a distributed and multicore learning procedure for some machine learning algorithms (kernel methods family).

       

      I would like to use MPI to communicate between servers and OpenMP to use all the cores in a server.

       

      I am not sure about how to configure the PBS script to that end.

       

      In the first line I have used this:

      #PBS -l nodes=10:skl:ppn=2

       

      1) Is this the correct way to select 10 servers with 2 cores on each server?

       

      2) If this is correct, why do I get an error when I select more than two cores (e.g. ppn=3)?

       

      >>qsub mytasksIRWLSd

      >>qsub: submit error (Job exceeds queue resource limits MSG=cannot locate feasible nodes (nodes file is empty, all systems are busy, or no nodes have the requested feature))

       

      When I run the pbsnodes command, the nodes report 24 CPUs (the ncpus parameter). This is an example:

       

      c009-n032

           state = free

           power_state = Running

           np = 2

           properties = xeon,skl,gold6128,ram96gb

           ntype = cluster

           status = rectime=1512208377,macaddr=a4:bf:01:2c:bd:d0,cpuclock=Fixed,varattr=,jobs=,state=free,netload=6031534547,gres=,loadave=0.00,ncpus=24,physmem=97437376kb,availmem=113236400kb,totmem=114211516kb,idletime=1831143,nusers=0,nsessions=0,uname=Linux c009-n032 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64,opsys=linux

           mom_service_port = 15002

           mom_manager_port = 15003

       

      3) To run the code on 10 servers with 2 cores on each server, is this the correct line?

      >> export OMP_NUM_THREADS=2

      >> mpirun -n 10 -ppn 2  ~/myExecutable ~/trainingFileInput ~/classifierFileOutput

       

      Yesterday I could execute my program (I don't know if I correctly selected the number of servers and cores), and it finished in around 3 minutes.

       

      Today it seems to be stuck:

       

      Job ID                    Name             User            Time Use S Queue

      ------------------------- ---------------- --------------- -------- - -----

      28747.c009                 mytasksIRWLSd    u7040           00:00:00 R batch         

      [u7040@c009 ~]$ qstat -a

       

      c009:

                                                                                        Req'd       Req'd       Elap

      Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory      Time    S   Time

      ----------------------- ----------- -------- ---------------- ------ ----- ------ --------- --------- - ---------

      28747.c009              u7040       batch    mytasksIRWLSd     71078    10     20       --   06:00:00 R  00:16:35

       

       

      I intend to publish a research paper with the results, so I need to know if I am doing everything correctly.

        • 1. Re: The correct way to launch hybrid mpi/openmp tasks
          Anju_Paul

          Hi,

           

          Please find my comments below:

           

          1) PBS script looks correct.

          2) There are two possible reasons for this:

                    1. There may be a system-level restriction imposed to ensure fair utilization of resources. I will check whether any such restriction is imposed on the compute nodes.

                    2. The second and most probable cause is that, since you requested 10 nodes each with 3 free cores, the Dev Cloud might not have had enough free resources to service your request at that time.

          3) I will check this and get back to you.

           

           

          Regards,

          Anju

          • 2. Re: The correct way to launch hybrid mpi/openmp tasks
            RobeDM

            When I request 1 node with 3 cores using this as the first line:

            #PBS -l nodes=1:skl:ppn=3

             

            I get the same error:

            qsub: submit error (Job exceeds queue resource limits MSG=cannot locate feasible nodes (nodes file is empty, all systems are busy, or no nodes have the requested feature))

             

             

            So it could be a restriction.

             

            How can I know the restrictions and the correct procedure to run a hybrid mpi/openmp task in this cluster?

            • 3. Re: The correct way to launch hybrid mpi/openmp tasks
              Anju_Paul

              Hi,

               

              There seems to be a system-level restriction on the ppn value.

              We have not yet received confirmation of this from the team responsible.

              Would it be fine if you used more nodes instead of more cores per node for the time being?

               

              Also, from what I observed, the command

              mpirun -n 10 -ppn 2  ~/myExecutable ~/trainingFileInput ~/classifierFileOutput

              creates 10 processes with 2 processes per node, i.e. it will use only 5 nodes.

              Hence, for this you need #PBS -l nodes=5:skl:ppn=2, not #PBS -l nodes=10:skl:ppn=2.
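
The node count in this reply is the total number of processes divided by the processes per node, rounded up. A minimal shell sketch of that arithmetic, assuming the meaning of -n (total processes) and -ppn (processes per node) described above; the helper name nodes_needed is hypothetical and not part of PBS or MPI:

```shell
#!/bin/bash
# Hypothetical helper (not part of PBS or MPI): number of nodes that
# "mpirun -n N -ppn P" spreads its processes over, i.e. ceil(N / P).
nodes_needed() {
    local n=$1 ppn=$2
    echo $(( (n + ppn - 1) / ppn ))
}

nodes_needed 10 2   # 10 processes, 2 per node
nodes_needed 10 1   # 10 processes, 1 per node
```

If the goal is one MPI process per server with two OpenMP threads each, the second case suggests that nodes=10:skl:ppn=2 would pair with -n 10 -ppn 1 and OMP_NUM_THREADS=2, although that pairing is an inference and not something confirmed in this thread.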

               

              Meanwhile, do you still have the issue of your job never finishing?

              If so, please let me know how you compiled the C/C++ program to create the executable.

               

              Regards,

              Anju

              • 4. Re: The correct way to launch hybrid mpi/openmp tasks
                Anju_Paul

                Hi,

                 

                It is confirmed that there is a restriction on the ppn value.

                Each node is configured with 2 slots only. All nodes in Dev Cloud now have a 2-slot configuration.

                If there is no Jupyter Notebook session, both slots will be used for batch mode.

                If there is a Jupyter Notebook session, then one slot will be used for the notebook and the other for batch.
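
The slot limit can be read from the pbsnodes output quoted in the first post: np is the number of slots the scheduler offers on a node, while ncpus in the status line is the CPU count the operating system reports. A small sketch that pulls both values out of one pbsnodes record; the function name slots_and_cpus is hypothetical, and the sample record is abbreviated from the first post:

```shell
#!/bin/bash
# Hypothetical helper: print "np=<slots> ncpus=<cpus>" for one pbsnodes
# record read from standard input.
slots_and_cpus() {
    awk '
        $1 == "np" && $2 == "=" { np = $3 }
        /ncpus=/ {
            # status is a comma-separated list of key=value pairs
            n = split($0, fields, ",")
            for (i = 1; i <= n; i++)
                if (fields[i] ~ /^ncpus=/) { sub(/^ncpus=/, "", fields[i]); ncpus = fields[i] }
        }
        END { print "np=" np " ncpus=" ncpus }
    '
}

# Abbreviated sample record; on the cluster one would pipe in: pbsnodes c009-n032
sample='c009-n032
     state = free
     np = 2
     status = rectime=1512208377,loadave=0.00,ncpus=24,physmem=97437376kb'

printf '%s\n' "$sample" | slots_and_cpus
```

For the sample record this reports 2 slots against 24 CPUs, which matches the ppn limit described in this reply.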

                 

                Regarding the MPI job that never finishes, please reply with answers to the following questions:

                1. Did you compile your C/C++ program in Dev Cloud to get the executable, or did you copy the executable to Dev Cloud?

                2. If you compiled it, did you compile with mpicc?

                 

                These

                Regards,

                Anju

                • 5. Re: The correct way to launch hybrid mpi/openmp tasks
                  Gael Hofemeier

                  Hello - was there an answer to this question? Anju_Paul

                   

                  How can I know the restrictions and the correct procedure to run a hybrid mpi/openmp task in this cluster?

                  • 6. Re: The correct way to launch hybrid mpi/openmp tasks
                    Anju_Paul

                    Hi,

                     

                    The correct procedure to run a hybrid MPI/OpenMP task is available at https://access.colfaxresearch.com/?p=compute

                    Access to this documentation is granted when a user gets access to Dev Cloud. It looks like it has already been consulted.

                    The other documentation that can be consulted is the notes and documents in this forum.

                     

                    I will check whether any additional documentation is available and share it if so.

                     

                    Regards,

                    Anju

                     

                    • 7. Re: The correct way to launch hybrid mpi/openmp tasks
                      RobeDM

                      Hi,

                       

                      I have already read that link. The section explains how to use the -n argument to select the number of MPI processes, but it doesn't say anything about OpenMP.

                       

                      I have some specific questions regarding this:

                       

                      1) In openmp you can select the number of threads with the environment variable OMP_NUM_THREADS. How can I set this environment variable in the PBS script?

                       

                      2) We cannot request a ppn value higher than 2 on a server. Is ppn the number of cores that I request on a server, or the number of processors (each of which can have several cores)?

                        If it is the number of cores, then it makes no sense to use a value of OMP_NUM_THREADS higher than 2.

                        If it is the number of processors, then I would like to know the number of cores in each one so that I can choose the best value for the variable.

                      • 8. Re: The correct way to launch hybrid mpi/openmp tasks
                        Anju_Paul

                        Hi,

                         

                        Please find the comments below:

                         

                        1. The export OMP_NUM_THREADS=<num_threads> line that you used is correct.

                        2. ppn is the number of virtual processors per node requested for this job. The virtual processor can relate to a physical core on the node or it can be interpreted as an "execution slot" such as on sites that set the node np value greater than the number of physical cores (or hyper-thread contexts).

                        Hence, it is not the number of processors.
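
To make point 1 concrete, here is a sketch of how the variable could be set in a job script. It assumes that $PBS_NODEFILE lists one line per granted slot and that mpirun accepts Intel MPI's -genv option for passing the variable to every process. The function name build_launch_line is hypothetical. The function only prints the mpirun line so that it can be checked before use:

```shell
#!/bin/bash
#PBS -l nodes=10:skl:ppn=2

# Hypothetical helper: build an mpirun line with one MPI process per node
# and one OpenMP thread per slot granted on that node.
# $1 = node file (one line per slot); remaining arguments = program and its arguments.
build_launch_line() {
    local nodefile=$1; shift
    local slots nodes threads
    slots=$(( $(wc -l < "$nodefile") ))            # total slots granted
    nodes=$(( $(sort -u "$nodefile" | wc -l) ))    # distinct servers
    threads=$(( slots / nodes ))                   # slots per server
    echo "mpirun -n $nodes -ppn 1 -genv OMP_NUM_THREADS $threads $*"
}

# Inside a running job PBS sets PBS_NODEFILE; elsewhere this block is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    build_launch_line "$PBS_NODEFILE" ~/myExecutable ~/trainingFileInput ~/classifierFileOutput
fi
```

Deriving the counts from the node file keeps the mpirun line consistent with whatever the #PBS -l line actually granted, so the two cannot drift apart when the request is edited.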

                         

                        Regards,

                        Anju

                        • 9. Re: The correct way to launch hybrid mpi/openmp tasks
                          Anju_Paul

                          Hi,

                           

                          Is your query answered? Shall we close this thread?

                           

                          Regards,

                          Anju

                          • 10. Re: The correct way to launch hybrid mpi/openmp tasks
                            RobeDM

                            Just one last question.

                             

                            I can run my code correctly now. I can test my distributed code using MPI and I can set the number of threads with the variable OMP_NUM_THREADS.

                             

                            Given that ppn can relate to a physical core or be interpreted as an "execution slot", and given that we cannot use a ppn value over 2, it makes no sense to use a value of OMP_NUM_THREADS higher than 2. Is this true? (All my threads carry a high computational load and do not perform tasks like disk reading.)

                             

                            I ask just to be sure, because it seems a little strange that I can test my code using 10 or more servers (which is quite good) but cannot use more than 2 cores on each one.

                            • 11. Re: The correct way to launch hybrid mpi/openmp tasks
                              HemanthK

                              Correct. The "execution slots" are restricted to 2.

                              You can also see it as the number of cores blocked from use by other users, i.e., ncores=ppn for shared nodes.
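
One way to respect that limit inside a job is to cap the thread count at the number of slots granted on the current host. A rough sketch, assuming that $PBS_NODEFILE lists each host once per slot and that the names in it match what hostname prints (on some systems one is fully qualified and the other is not); the function name capped_threads is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: cap a requested OpenMP thread count at the number of
# slots that the node file grants on a given host.
# $1 = node file (one line per slot), $2 = host name, $3 = requested threads.
capped_threads() {
    local nodefile=$1 host=$2 requested=$3
    local slots
    # grep -c prints 0 and returns non-zero when nothing matches, hence "|| true"
    slots=$(grep -c -x -F -- "$host" "$nodefile" || true)
    if [ "$slots" -gt 0 ] && [ "$requested" -gt "$slots" ]; then
        echo "$slots"
    else
        echo "$requested"
    fi
}

# Inside a running job PBS sets PBS_NODEFILE; elsewhere this block is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -f "$PBS_NODEFILE" ]; then
    OMP_NUM_THREADS=$(capped_threads "$PBS_NODEFILE" "$(hostname)" 24)
    export OMP_NUM_THREADS
    echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
fi
```

If the host is not found in the node file, the requested value is left unchanged, so the helper never silently sets the thread count to zero.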