8 Replies Latest reply: Jan 10, 2011 1:05 PM by ohntz

    Running RCCE v1.0.13 emulator issues

    ohntz

      Hi,

       

      Before running my experimental programs I am trying to make the "PINGPONG" example work, as SCC_Platform_Overview suggests.

      But I am encountering problems.

       

      I am not filing a bugzilla (yet) because I don't know if these are bugs or maybe I'm doing something wrong.

       

      The problems:

      (1)

      I get an error on rccerun, line 155, saying "pingpong.exe command not found".

      When I add "./" at the beginning of the line, it runs without errors.

       

      (2)

      Running "./rccerun -nue 2 -f ../../hosts/rc.hosts pingpong.exe" does nothing. It just writes the command line and exits.

      Debugging a little, I found that having RCCE_send() or RCCE_recv() anywhere in the code prevents any printf() elsewhere from producing output.

       

      (3)

      I wrote the simplest Hello World program in the world:

       

      #include <stdio.h>

      int main(int argc, char **argv) {

           printf("Hello World\n");

           return 0;

      }

       

      I run it:

      ./rccerun -nue 2 -f ../../hosts/rc.hosts try.exe

       

      The output:

      Hello World

      Hello World

       

      Whenever I run it again, I get:

      Hello World

      (printed only once!)

       

      It continues happening until I re-build my program.

       

      I hope someone can help me. I really want to get past the first stages.

       

      Thanks in advance,

      Ohn

        • 1. Re: Running RCCE v1.0.13 emulator issues
          tedk

          I tried the emulator for 1.0.13 on Linux with gcc (actually g++ but on my Ubuntu it's the same).

          I see it working fine, but I had to add the -emulator switch to rccerun. Now I know that the docs say that -emulator is the default when you build with the emulator, but rcce 1.0.13 is an old version. The emulator is broken in the trunk, and that is the main thing holding up rcce 1.1.

           

          I have not yet tried the emulator on a non-Linux platform. I understand that you are using Win 7/cygwin. I don't have cygwin installed on my laptop, but if this is still an issue for you, I will install it.

           

          Here's how I ran the "simplest Hello World in the world." I'm not seeing that strange issue you've seen with "Hello World" being printed only once.

           

          tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/HELLO$ rccerun -emulator -nue 2 -f ../../hosts/rc.hosts hello
          hello 2 0.533 00 01
          hello from RCCE
          hello from RCCE
          tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/HELLO$ rccerun -emulator -nue 2 -f ../../hosts/rc.hosts hello
          hello 2 0.533 00 01
          hello from RCCE
          hello from RCCE
          tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/HELLO$ rccerun -emulator -nue 2 -f ../../hosts/rc.hosts hello
          hello 2 0.533 00 01
          hello from RCCE
          hello from RCCE
          tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/HELLO$ rccerun -emulator -nue 2 -f ../../hosts/rc.hosts hello
          hello 2 0.533 00 01
          hello from RCCE
          hello from RCCE

          tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/HELLO$

           

          I can also run pingpong.

          I built rcce as follows ....

             ./configure emulator

             ./makeall

             cd apps/PINGPONG

             make

           

          Then I ran pingpong as

          tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/PINGPONG$ rccerun -emulator -nue 2 -f ../../hosts/rc.hosts pingpong
          pingpong 2 0.533 00 01
          32  0.000000739
          32  0.000000735

          :

          :

          2965792  0.001312684
          3526944  0.001710459
          tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/PINGPONG$

           

          When I do not put the -emulator switch in, I see a hang.

          • 2. Re: Running RCCE v1.0.13 emulator issues
            tedk

            Actually, it is not a hang. I get the following error; it just took a long time to appear.

             

            tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/HELLO$ rccerun  -nue 2 -f ../../hosts/rc.hosts hello
            pssh -h PSSH_HOST_FILE.7236 -t -1 -p 2 /home/tkubasx/sandbox/RCCE_V1.0.13/apps/HELLO/mpb.7236 < /dev/null
            [1] 15:23:15 [FAILURE] rck00 Exited with error code 255
            [2] 15:23:15 [FAILURE] rck01 Exited with error code 255
            pssh -h PSSH_HOST_FILE.7236 -t -1 -P -p 2 /home/tkubasx/sandbox/RCCE_V1.0.13/apps/HELLO/hello 2 0.533 00 01 < /dev/null
            [1] 15:23:36 [FAILURE] rck00 Exited with error code 255
            [2] 15:23:36 [FAILURE] rck01 Exited with error code 255
            tkubasx@marc101:~/sandbox/RCCE_V1.0.13/apps/HELLO$

            • 3. Re: Running RCCE v1.0.13 emulator issues
              ohntz

              I think we are running different rcceruns.

               

              I'll describe what I do differently:

               

              1. As the document suggests, I copied rccerun from /RCCE_V1.0.13/ to the running directory. I can't just type rccerun anywhere and run it.

              2. Then I run it with "./rccerun -nue 2 -f ../../hosts/rc.hosts pingpong.exe".

              3. I have to state "pingpong.exe" instead of "pingpong".

              4. It fails, and I have to go to rccerun, line 155, and add "./" before "$EXE $ARGS". Then it runs and finishes but prints nothing.

               

              If you have not tried it yet, can you please try using /RCCE_V1.0.13/rccerun, without any "intel" definitions you might have in your login, and see whether it still runs without errors?

               

              Thanks Ted, you help a lot!

              • 4. Re: Running RCCE v1.0.13 emulator issues
                tedk

                I redid the example and ensured that I was using the same rccerun. You are correct ... I may have been using a more recent rccerun. But I don't see any errors using the 1.0.13 rccerun ... as shown below. Did you put rccerun in your path?

                 

                Of course, this is Linux not Win7/cygwin. We'll set up a Win7/cygwin system here locally to test the emulator on, hopefully by next week. We haven't seen reported problems with the emulator, but then most people use it on Linux.

                 

                $export PATH=.:/home/tkubasx/sandbox/RCCE_V1.0.13:$PATH
                $echo $PATH
                .:/home/tkubasx/sandbox/RCCE_V1.0.13:/home/tkubasx/bin:.:/home/tkubasx/bin:/home/tkubasx/sandboxJF/rcce:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/opt/sccKit/1.3.0/bin
                $cd sandbox
                $cd RCCE_V1.0.13/
                $ls
                apps  build_stress_test  configure  hosts    makeall   man      README           sourcing  utils
                bin   common             COPYING    include  Makefile  rccerun  run_stress_test  src

                 

                $which rccerun
                /home/tkubasx/sandbox/RCCE_V1.0.13/rccerun

                $rccerun -nue 2 -f ../../hosts/rc.hosts hello
                hello 2 1.0 00 01
                hello from RCCE
                hello from RCCE

                I can run this many times and see the same result.


                $rccerun -nue 2 -f ../../hosts/rc.hosts pingpong
                pingpong 2 1.0 00 01
                32  0.000000740
                32  0.000000736

                :

                2493920  0.001060648
                2965792  0.001343287
                3526944  0.001673127

                $
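A likely explanation for the "./" difference, sketched under assumptions: a command name without a slash is looked up only in the directories listed in PATH, and the PATH printed above begins with ".", so a bare $EXE resolves from the current directory on this setup. Where "." is not on PATH, the same line would need "./". The file name demo_prog.exe, the helper resolve and the temporary directory are made up for illustration; nothing here is taken from rccerun itself.

```shell
# Sketch: why "prog" can fail while "./prog" works when "." is not on PATH.
# demo_prog.exe and the temporary directory are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '#!/bin/sh\necho "hello from demo"\n' > demo_prog.exe
chmod +x demo_prog.exe

# Report how a bare command name resolves under a given PATH.
# The lookup runs in a subshell, so the caller's PATH is left untouched.
resolve() {
    ( PATH=$1; command -v "$2" ) || echo "not found"
}

resolve "/usr/bin:/bin" demo_prog.exe       # "." absent: prints "not found"
resolve ".:/usr/bin:/bin" demo_prog.exe     # "." present: prints the resolved name
./demo_prog.exe                             # an explicit path never consults PATH
```

Adding "." to PATH is convenient for experiments like this one; prefixing "./" in the script is the narrower change.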

                • 5. Re: Running RCCE v1.0.13 emulator issues
                  ohntz

                  I further investigated my problem:

                   

                  PINGPONG works for me only when I drastically reduce the size of array "buffer[]" in RCCE_pingpong.c (from the default [1024*1024*4] to let's say [32*32*4]).

                   

                  Bigger values for "buffer[]" or "nrounds" make the program very slow.

                  BIG numbers (like the default 1024*1024*4) make pingpong exit instantly after running it.

                   

                  How do you explain that?

                   

                  Also, what is buffer[] anyway? Which memory are these 4MB allocated from?

                   

                   

                   

                  About the single/double printing of "Hello World" -

                   

                  The phenomenon has ceased to occur. I don't know what I did that fixed it. I did not touch the .c or .exe.

                  It's strange. Thanks for trying it for me though.

                   

                  Ohn

                  • 6. Re: Running RCCE v1.0.13 emulator issues
                    cscholtes

                    I recently encountered similar issues with pingpong:

                    Running pingpong from the shell (/bin/bash) succeeded.

                    Running it from within a makefile failed.

                     

                    The reason was that make uses /bin/sh as its shell, which

                    (by default) provides a different amount of memory.

                     

                    The memory for buffer[] is taken from the stack (once per UE).

                     

                    I don't know how to set maximum stack size for sh

                    but for bash one can use:

                     

                      ulimit -s <stack size in kB>

                     

                    Setting the stack size to a bit more than

                    4MB per UE should be enough to make pingpong run.

                    The tool ulimit also has the option of setting the stack size to "unlimited",

                    but that actually appeared to provide less memory

                    (it made pingpong fail). The default stack size may depend

                    on some system-specific setting. If you are using a different shell,

                    you will probably have to use a different tool

                    to set the stack size large enough...
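The check described above can be sketched roughly as follows. The 4 MB figure is the default buffer[] size quoted in this thread; the 1 MB margin for everything else on the stack and the helper name check_stack are assumptions made for illustration.

```shell
# Sketch: compare the shell's soft stack limit with what pingpong needs.
buffer_kb=$((1024 * 1024 * 4 / 1024))   # default buffer[] size, in kB
margin_kb=1024                          # assumed headroom, not a measured value
needed_kb=$((buffer_kb + margin_kb))

# Classify a limit as printed by "ulimit -s" (a number in kB, or "unlimited").
check_stack() {
    case $1 in
        unlimited) echo "unlimited" ;;  # no number to compare; test by running
        *) if [ "$1" -ge "$2" ]; then echo "ok"; else echo "too small"; fi ;;
    esac
}

echo "current limit: $(ulimit -s), needed: about $needed_kb kB"
check_stack "$(ulimit -s)" "$needed_kb"

# Raising the soft limit inside a subshell keeps the change local to one run:
#   ( ulimit -s "$needed_kb" && rccerun -nue 2 -f ../../hosts/rc.hosts pingpong )
```

The soft limit can only be raised up to the hard limit, so the ulimit call in the last comment may itself fail on a restricted system.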

                    • 7. Re: Running RCCE v1.0.13 emulator issues
                      tedk

                      Ohn,

                      Is this what you are doing ... running pingpong from within the Makefile?

                       

                      If I add the rccerun line to the Makefile as follows ...

                       

                      pingpong: $(PINGPONGOBJS) Makefile
                              $(CCOMPILE) -o pingpong $(PINGPONGOBJS) $(CFLAGS)
                              rccerun -nue 2 -f ../../hosts/rc.hosts pingpong

                       

                      then I get the error

                       

                      g++ -o pingpong RCCE_pingpong.o  /home/tkubasx/sandbox/RCCE_V1.0.13/bin/OMP/libRCCE_bigflags_nongory_nopwrmgmt.a -O3 -fopenmp    -I/home/tkubasx/sandbox/RCCE_V1.0.13/include
                      rccerun -nue 2 -f ../../hosts/rc.hosts pingpong
                      pingpong 2 1.0 00 01
                      /home/tkubasx/sandbox/RCCE_V1.0.13/rccerun: line 183: 10934 Segmentation fault      $EXE $ARGS
                      make: *** [pingpong] Error 139

                      If I take out the rccerun from the Makefile and run it from the command line, it works fine.
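Resource limits such as the stack size are inherited from the parent process, so one way to check the /bin/sh explanation from the earlier reply is to measure what each launcher actually reports rather than guess. A small diagnostic sketch (the helper name limit_seen_by and the show-stack target are made up for illustration):

```shell
# Sketch: print the soft stack limit (kB, or "unlimited") that a given
# shell reports when it is started from the current process.
limit_seen_by() {
    "$1" -c 'ulimit -s'
}

for candidate in /bin/sh /bin/bash; do
    if [ -x "$candidate" ]; then
        echo "$candidate: $(limit_seen_by "$candidate")"
    fi
done

# To see the value from inside a make recipe, a temporary target can help
# (hypothetical, not part of the RCCE Makefiles):
#   show-stack:
#           @$(SHELL) -c 'ulimit -s'
```

If the value printed from the make recipe differs from the one at the interactive prompt, that difference is the thing to chase.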

                      • 8. Re: Running RCCE v1.0.13 emulator issues
                        ohntz

                        Hi Carsten and Ted,

                         

                        It seems that the Cygwin default stack size is 2048KB, and that it cannot be changed with shell commands.

                         

                        One can, however, compile with a different stack size, using:

                        g++ -Wl,--stack,8000000

                        (for a stack of about 8MB)

                         

                        I added this option to /common/symbols.in and re-ran makeall, and pingpong worked with the large buffer.
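As a quick sanity check on the numbers, here is a sketch that compares the sizes involved. It treats the --stack value as a byte count, which is an assumption based on how the 8MB figure is quoted above; the helper name covers is made up.

```shell
# Sketch: which stack sizes are large enough for pingpong's default buffer[]?
buffer_bytes=$((1024 * 1024 * 4))   # default buffer[] size from this thread
cygwin_default=$((2048 * 1024))     # the 2048KB default reported above
new_stack=8000000                   # value passed to -Wl,--stack

# Print "yes" if the first size is strictly larger than the second.
covers() {
    if [ "$1" -gt "$2" ]; then echo yes; else echo no; fi
}

echo "default stack covers buffer: $(covers "$cygwin_default" "$buffer_bytes")"   # no
echo "new stack covers buffer:     $(covers "$new_stack" "$buffer_bytes")"        # yes
```

The comparison is strict because the stack also has to hold everything else the program puts there, not only buffer[].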

                         

                        All is clear here now.

                         

                        Thank you very much.

                         

                        Ohn