4 Replies Latest reply on Jan 24, 2011 10:03 AM by tedk

    RCCE version mismatch -> Segfault?

    devendra.rai

      Hello All.

       

      I am getting error code 139 (segmentation fault) whenever I call RCCE_init(...). I thought the problem was in my application, so I tried running the pingpong example provided by Intel. It also segfaults when calling RCCE_init(...).

       

      I have not rebuilt RCCE since I first downloaded it (Sep 2010), but I would still find it remarkable to get a segfault just because the version changed. I am also sharing marc009 with another team in the US.

       

      Can someone help me narrow down the cause of the error?

       

      A related question is debugger support on the cores. Following someone's suggestion on the discussion list, I tried building strace, termcap and gdb. The first two went well, but in spite of many attempts, I can't get gdb to build. I also built ncurses. Also, has anyone had success using gdb with executables built with icpc? I remember seeing gdb start in some Intel libraries instead of main. So I am having to resort to printfs to debug, which is not a great way of doing it.

       

      Thanks a lot for your help.

       

      Cheers


      Devendra Rai

        • 1. Re: RCCE version mismatch -> Segfault?
          tedk

          I don't think you should see a seg fault with a binary that used to work. If you make your binary available, I can try it out ... but debugging on this system usually means editing and rebuilding. What version of RCCE was your app built with? Is there some reason why you cannot rebuild? I tried out a "hello world":

           

          #include <string.h>
          #include <stdio.h>
          #include "RCCE.h"

          int RCCE_APP(int argc, char **argv){

            RCCE_init(&argc, &argv);

            printf("hello from RCCE ... I am %s\n",RCCE_VERSION);

            RCCE_finalize();

            return(0);
          }

           

          and it works fine.

          rck00: hello from RCCE ... I am 1.0.13.x

           

          But then I'm using RCCE from the trunk.

          • 2. Re: RCCE version mismatch -> Segfault?
            devendra.rai

            Hello Ted

             

            Thanks for responding. I am attaching the folder (with sources and makefile, plus the executable I built). I run it as:

             

            rccerun -nue 2 -f /shared/devendra.rai/rc.hosts  pingpong  3

             

             

            (The frequency stuff was removed, since it was not being used).

             

            The source was annotated with some printfs, just to see where it breaks. My version of RCCE is definitely older than 1.0.13.

             

            Thanks a lot.

             

            Devendra Rai

            • 3. [Solved] Re: RCCE version mismatch -> Segfault?
              devendra.rai

              Hello Ted,

               

              Thanks for your help. I found that the problem was in the rccerun file. I had commented out the ARGS="$NUE $REFCLOCKGHZ" line, since I thought that these arguments were not being used by the application anyway.

               

              Putting these back in makes the app run fine. I am not really sure how and why these arguments are needed. I will be looking into it in detail.

               

              Case closed for now.

               

              Thanks a lot

               

               

              Devendra

              • 4. Re: [Solved] Re: RCCE version mismatch -> Segfault?
                tedk

                I do think you should run with the latest RCCE, and I would recommend the trunk if you are not using the emulator. Although I think your app should work OK with an older RCCE, there is no effort here to ensure that RCCE apps work with previous versions. It's an open-source research project, not a product.

                 

                rccerun is essentially a wrapper around pssh. I haven't parsed every line of rccerun but think the $ARGS line is necessary. $NUE specifies the number of cores you are running on. I think $NUE gets that value from the rccerun command line. I don't think there is a default. I think there is a default for REFCLOCKGHZ.
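
To illustrate why dropping the $ARGS line could end in a segfault inside an init call, here is a minimal sketch in C. It assumes (this thread does not confirm it) that the launcher prepends the number of cores and the reference clock to the application's own arguments, and that the init routine consumes them from argv. The names launcher_args and parse_launcher_args are hypothetical and are not part of RCCE. The point is the argc check: code that reads argv[1] and argv[2] without it would walk past the end of the argument vector when the launcher passes nothing.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for the values the launcher is assumed to prepend. */
typedef struct {
    int    nue;          /* number of cores, from $NUE */
    double refclock_ghz; /* reference clock in GHz, from $REFCLOCKGHZ */
} launcher_args;

/*
 * Hypothetical helper, not part of RCCE: consume the two leading launcher
 * arguments from argv after checking that they are present.
 * Returns 0 on success, -1 if the arguments are missing or malformed.
 */
int parse_launcher_args(int *argc, char ***argv, launcher_args *out)
{
    char *end;

    /* argv[0] is the program name; two more entries are required. */
    if (*argc < 3) {
        fprintf(stderr, "missing launcher arguments: <nue> <refclock_ghz>\n");
        return -1;
    }

    long nue = strtol((*argv)[1], &end, 10);
    if (end == (*argv)[1] || *end != '\0' || nue <= 0)
        return -1;

    double ghz = strtod((*argv)[2], &end);
    if (end == (*argv)[2] || *end != '\0' || ghz <= 0.0)
        return -1;

    out->nue = (int)nue;
    out->refclock_ghz = ghz;

    /* Move the program name forward so the application sees only its own
       arguments, starting again at argv[1]. */
    (*argv)[2] = (*argv)[0];
    *argv += 2;
    *argc -= 2;
    return 0;
}
```

Under these assumptions, calling parse_launcher_args(&argc, &argv, &args) at the top of main would return -1 with a readable message when the launcher arguments are absent, instead of crashing.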