4 Replies Latest reply on Jun 20, 2011 9:06 AM by tedk

    RCCE_DCMflush behavior




      I am using definitely cached memory and am having trouble flushing the cache. My method is as follows:


      1) All the cores synchronize at a barrier. All cores determine the start of the shared memory.

      2) UE 0 allocates shared memory using RCCE_shmalloc.

      3) All cores other than UE 0 block on an RCCE_recv from the UE before them.

      4) UE 0 writes something into the allocated shared memory, then calls RCCE_DCMflush.

      5) The current UE sends the offset of the shared memory location to the next UE.

      6) The receiving UE gets the offset and calls RCCE_DCMflush.

      7) It reads the shared memory location, processes the data by writing something, and calls RCCE_DCMflush again.

      8) Go to step 5 unless this is the last UE.
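The hand-off order in the steps above can be modelled in a single process, which may help when checking what value each UE should see. This is only a sketch under stated assumptions: `NUM_UES`, `SHM_BYTES`, `model_flush`, `first_ue`, `later_ue`, and `run_chain` are hypothetical names, the array `shm` stands in for the region obtained from RCCE_shmalloc, and `model_flush` is a no-op placed where the post calls RCCE_DCMflush. It does not call RCCE and does not reproduce the caching behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_UES   4    /* assumed number of cores in the model */
#define SHM_BYTES 64   /* assumed size of the shared region */

/* Stand-in for the region UE 0 obtains from RCCE_shmalloc (step 2). */
static unsigned char shm[SHM_BYTES];

/* Hypothetical no-op marking where the post calls RCCE_DCMflush. */
static void model_flush(void) { }

/* UE 0 (step 4): write the initial value, flush, return the offset to send. */
static size_t first_ue(size_t offset)
{
    assert(offset < SHM_BYTES);
    shm[offset] = 1;
    model_flush();
    return offset;                 /* step 5: handed to the next UE */
}

/* Any later UE (steps 6 and 7): flush, read, update, flush again. */
static size_t later_ue(size_t offset)
{
    assert(offset < SHM_BYTES);
    model_flush();                                   /* step 6 */
    shm[offset] = (unsigned char)(shm[offset] + 1);  /* step 7: read and write */
    model_flush();                                   /* step 7: flush the write */
    return offset;                 /* step 5 again for the next UE */
}

/* Runs UEs 0..NUM_UES-1 in order and returns the value the last UE leaves. */
static int run_chain(void)
{
    size_t offset;
    int ue;

    memset(shm, 0, sizeof shm);
    offset = first_ue(0);
    for (ue = 1; ue < NUM_UES; ue++)
        offset = later_ue(offset); /* step 8: repeat until the last UE */
    return shm[offset];
}
```

In this model, `run_chain()` returns a value equal to `NUM_UES`, because every UE adds one to the same location. A real run that ends with a smaller value would be consistent with one UE's write not having reached shared memory before the next UE read it.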


      The behaviour I am seeing is that the flush seems to fail intermittently, and not always at the same UE: some of the writes that I make to shared memory are not getting flushed.

        • 1. Re: RCCE_DCMflush behavior

          Please look at bug 195. http://marcbug.scc-dc.com/bugzilla3/show_bug.cgi?id=19


          We are having difficulty with this flush routine. Our latest version of rckmem.c (called rckmem_005.c) is posted on that bug. We've been testing it and not seeing errors, but the latest comment on the bug discusses a flaw.


          You can use the latest version we have by downloading rckmem_005.c from the bug, renaming it to rckmem.c, putting it in your Linux source tree, and building a new Linux image. If you are using the rckmem.c currently in SVN, you will see problems.

          • 2. Re: RCCE_DCMflush behavior

            OK. Let me try the new rckmem.c over the weekend and test it. If I am still having trouble, I'll post my program here and maybe you could try it out as well on another machine.




            • 3. Re: RCCE_DCMflush behavior

              I am still having trouble with the flush to shared memory after compiling a new image with the modified rckmem.c. However, if I make each core allocate a dummy array in its private memory (of the same size as the shared memory), and then write to and read back from this dummy array immediately after changing shared memory and calling RCCE_DCMflush, the writes do get flushed to shared memory properly.


              Do you want me to post the program that I am using so that you can try it out as well?
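The dummy-array step described in this reply could look roughly like the helper below. It is a sketch, not the poster's code: the name `touch_private_buffer`, the fill pattern, and the checksum return value are assumptions, and the thread does not establish why touching a private buffer helps. The `volatile` qualifier and the returned sum are there only so a compiler does not remove the write and read loops.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Writes every byte of a private buffer, then reads every byte back.
 * Intended to be called right after modifying shared memory and calling
 * RCCE_DCMflush, with len equal to the size of the shared region.
 * Returns the sum of the bytes read so the loops have an observable result.
 */
static unsigned long touch_private_buffer(volatile unsigned char *buf, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = (unsigned char)(i & 0xFF);   /* write pass */

    for (i = 0; i < len; i++)
        sum += buf[i];                        /* read-back pass */

    return sum;
}
```

A caller would allocate the buffer once per core (for example with `malloc`) and reuse it after each flush. Since this is a workaround rather than a fix, it is probably best treated as a diagnostic aid while the flush routine itself is being tested.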

              • 4. Re: RCCE_DCMflush behavior

                Yes, of course ... but please join the CC group at bug 195 and post there... which is where the real nitty-gritty testing of the flush routine is taking place.



                Our latest rckmem_005.c has not been showing errors. But it's possible your test is picking up a test case we missed.