I have a shared memory application which I would like to accelerate using SCC. I have been looking at the documentation, and I get the impression that POP-SHM looks like the best approach for this (I understand that RCCE provides some support for shared memory, but am I right that POP-SHM offers a cleaner solution?)
Thanks to this post:
I have the tarball:
which I believe is an implementation of POP-SHM. What would really help would be an example application that uses this API, with instructions on how to get the application running across all cores of SCC.
Can anyone tell me whether some such tutorial application exists?
...a more basic question:
Is there any way to run my shared memory application, which uses fork to create processes, and shmget etc. to create shared memory between processes, directly on SCC?
Ideally I would just recompile this application and try running it across all of the SCC cores. I get the impression this isn't possible, but thought it would be worth double-checking!
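For readers unfamiliar with the pattern, here is a minimal sketch of the kind of program described above: processes created with fork that share a System V segment obtained with shmget. The function name sum_of_squares_via_shm and the work each child does are made up for illustration. It is ordinary single-OS POSIX code and says nothing about whether the pattern carries over to SCC.

```c
#define _XOPEN_SOURCE 700   /* expose fork, wait and System V IPC under strict -std= modes */
#include <assert.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork nproc children; child i stores i*i in slot i of a shared segment.
 * The parent waits for every child and returns the sum of all slots,
 * or -1 if anything went wrong. */
long sum_of_squares_via_shm(int nproc)
{
    assert(nproc > 0);

    /* One long per child, private to this process tree. */
    int shmid = shmget(IPC_PRIVATE, (size_t)nproc * sizeof(long), IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    long *slots = shmat(shmid, NULL, 0);
    if (slots == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }

    int started = 0;
    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            /* Child: the attachment is inherited across fork. */
            slots[i] = (long)i * i;
            shmdt(slots);
            _exit(0);
        }
        started++;
    }

    /* Reap every child that was actually started. */
    int ok = (started == nproc);
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }

    long sum = 0;
    for (int i = 0; i < nproc; i++)
        sum += slots[i];

    /* Detach and mark the segment for removal. */
    shmdt(slots);
    shmctl(shmid, IPC_RMID, NULL);
    return ok ? sum : -1;
}
```

Calling sum_of_squares_via_shm(4) from main returns 0 + 1 + 4 + 9 = 14 when all four children run. Each child writes only its own slot, so no locking is needed here.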
I would suggest you start by looking at the main.c file in the small project you downloaded (popshm-20110126 tarball). It will compile into an executable called 'test', since it is basically a test used during POP-SHM implementation. Of most interest for development are the different outputs that show the internal state of the library, and the memcpy interface.
To my knowledge, the only "real" use of this library so far is in RCKMPI.
You can use POP-SHM as a stand-alone solution. However, due to the higher latencies of the off-chip memory, it may be desirable to use RCCE or the test-and-set registers for synchronization (you are also free to develop your own protocol, as is the case in RCKMPI).
After you look at and build the main.c file from that tarball, I can help you with any questions regarding that file or its use in RCKMPI.
PS: There is also a user manual available, not sure if Ted already posted it.
Edit: Here is the user manual:
The POP-SHM and RCK User's Guide are posted on the front page of this web site under Documentation. The POP-SHM User's Guide was there for some time, but I just put the RCK MPI User's Guide there. Bug 138 ( http://marcbug.scc-dc.com/bugzilla3/show_bug.cgi?id=138 ) tracked the RCK MPI beta. The RCK MPI User's Guide was available as an attachment to the bug, but I couldn't find it on this site.
If you are using RCCE with POP-SHM, please be careful not to use the RCCE shared memory calls. They have hijacking knowledge embedded in them. I think, however, that, as Isaias explained, you can use a RCCE built with SHMADD or SHMADD_CACHEABLE. I had cautioned against that previously but after reading Isaias's post, I think now it is OK. The LUT mapping that occurs when RCCE is built that way should not affect how POP-SHM operates. I think a RCCE interface to POP-SHM is desirable, but does not yet exist ... you still can use RCCE with POP-SHM.
The popshm tar file on our SVN is not an implementation of POP-SHM but rather a simple example of using it ... a place to start. You should ensure that you load a Linux with POP-SHM enabled. You can check by logging onto a core and issuing cat /proc/meminfo |grep POP and you should see
POPSHM pages: 1
POPSHM page size: 16384 kB
POPSHM buffer size: 16384 kB
POPSHM base address: 0x10000000
This example shows one default POPSHM page. This default value comes from the binary file commandline.bin which is used when you build SCC Linux. Here is how that file looks when you look at it with bvi. You can change the value of popshmpages to 2 or 3. I don't think 4 is currently supported, but Isaias would know for sure. Note that this is the default value .... you can change the value dynamically with a POPSHM call as described in the POPSHM docs.
00000000 20 20 20 20 20 20 20 20 20 20 20 6E 6F 2D 68 6C no-hl
00000010 74 20 70 6F 70 73 68 6D 70 61 67 65 73 3D 31 00 t popshmpages=1.
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
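Going by the dump above, commandline.bin appears to hold a space-padded, NUL-terminated ASCII kernel command line. Under that assumption, a minimal sketch for pulling the popshmpages value out of such a string could look like this. The function name popshmpages_from_cmdline is made up for illustration and is not part of sccKit or POP-SHM.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of the "popshmpages=<n>" option in a NUL-terminated
 * command line, or -1 if the option is absent or malformed. */
int popshmpages_from_cmdline(const char *cmdline)
{
    static const char key[] = "popshmpages=";
    const size_t klen = sizeof key - 1;

    assert(cmdline != NULL);

    for (const char *p = cmdline; (p = strstr(p, key)) != NULL; p += klen) {
        /* Only accept the key at the start of the string or after a space,
         * so that e.g. "xpopshmpages=2" is not mistaken for the option. */
        if (p != cmdline && p[-1] != ' ')
            continue;

        char *end;
        long v = strtol(p + klen, &end, 10);
        if (end == p + klen || v < 0 || v > INT_MAX)
            return -1;          /* no digits, or value out of range */
        return (int)v;
    }
    return -1;                  /* option not present */
}
```

For the bytes shown in the dump (eleven spaces followed by "no-hlt popshmpages=1"), the function returns 1.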
I understand that to use POPSHM I need to boot a patched Linux on the SCC cores.
However, I cannot get this to work. I have checked out the MARC repository, and would like to boot "linux_dcm_02_23_11.obj" which is under "trunk/CustomSCCLinux" (please let me know if that's not the right version to be booting)
I have tried:
sccBoot -l 0..7 -g ~/marcrepository/trunk/CustomSCCLinux/linux_dcm_02_23_11.obj
but this gives an error:
ERROR: Please don't mix -l, -s and -g options... Aborting!
(I posted a separate discussion about this, as it appears to be a more general issue to do with sccBoot)
I then tried just to use -g:
sccBoot -g /home/ally/marcrepository/trunk/CustomSCCLinux/linux_dcm_02_23_11.obj
(the long path is exactly where this .obj file is on my system)
This leads to the following:
WARNING: Please provide valid obj-path (given path doesn't contain memory image)
Am I doing something wrong, or could it be that the .obj file in the repository is invalid?
Many thanks for any help
What version of sccKit are you using? Please look at the README.txt in http://marcbug.scc-dc.com/svn/repository/trunk/CustomSCCLinux/README.txt
The file linux_dcm_02_23_11.obj is intended for use with sccKit 1.3.0.
If you are using 1.4.0 (you should be, BTW), the file you want to load is BuildSCCLinux417_002.obj. This also enables eMAC. If you cd to the directory containing the custom Linux object, try the following. /proc/meminfo will tell you if POPSHM is enabled.
$ sccBoot -l BuildSCCLinux417_002.obj 0..47
INFO: Welcome to sccBoot 1.4.0 (build date Mar 21 2011 - 18:39:01)...
INFO: Starting to boot Linux: All cores!
INFO: Using linux image "./BuildSCCLinux417_002.obj" (provided by user)...
INFO: Creating .mt file "/tmp/sccKit_tekubasx/linux.mt"...
INFO: Copied object ./BuildSCCLinux417_002.obj to faster local disc (/tmp/sccKit_tekubasx/)...
INFO: Merging objects with sccMerge:
INFO: -> sccMerge -broadcast -m 8 -n 12 -noimage linux.mt
INFO: Pulling resets and enabling L2 caches: All cores!
INFO: Preloading Memory with object file...
Loading content of file "/tmp/sccKit_tekubasx/obj/mch_0_0.32.obj" to DestID PERIW of Tile 0x0 (RC port)...
INFO: writeMemFromOBJ(...): Configuration of memory done!
INFO: Preloading LUTs...
INFO: Configuring LUTs with content of file "/tmp/sccKit_tekubasx/obj/lut_init.dat"...
Configuring LUTs with content of file "/tmp/sccKit_tekubasx/obj/lut_init.dat"...
INFO: -> Configuration of LUTs done!
INFO: (Re-)configuring GRB registers...
INFO: Releasing resets...
INFO: Linux has been started successfully. Cores should be reachable via TCP/IP shortly...
$ sleep 60
$ ssh root@rck00
root@rck00:~> cat /proc/meminfo |grep POP
POPSHM pages: 1
POPSHM page size: 16384 kB
POPSHM buffer size: 16384 kB
POPSHM base address: 0x10000000
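If you want to do the same check from inside a program rather than with grep, here is a minimal sketch that parses the four POPSHM lines in the format shown above. The struct and function names are made up for illustration. It assumes the contents of /proc/meminfo have already been read into a NUL-terminated buffer, and that the field labels match the output above exactly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values taken from the POPSHM lines of /proc/meminfo. */
struct popshm_info {
    unsigned long pages;
    unsigned long page_size_kb;
    unsigned long buffer_size_kb;
    unsigned long base_address;
};

/* Scan text line by line and fill *out.
 * Returns how many of the four POPSHM fields were found (0 to 4). */
int parse_popshm_info(const char *text, struct popshm_info *out)
{
    int found = 0;

    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    for (const char *line = text; line != NULL && *line != '\0'; ) {
        if (sscanf(line, "POPSHM pages: %lu", &out->pages) == 1)
            found++;
        else if (sscanf(line, "POPSHM page size: %lu kB", &out->page_size_kb) == 1)
            found++;
        else if (sscanf(line, "POPSHM buffer size: %lu kB", &out->buffer_size_kb) == 1)
            found++;
        else if (sscanf(line, "POPSHM base address: %lx", &out->base_address) == 1)
            found++;

        /* Advance to the start of the next line, if there is one. */
        const char *nl = strchr(line, '\n');
        line = nl ? nl + 1 : NULL;
    }
    return found;
}
```

A return value of 0 would suggest the running kernel was not built with POP-SHM support. A return value of 4 with pages == 1 corresponds to the default configuration shown above.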
It looks like I'm using sccKit 1.3.0 - when I try the command you recommended I get:
INFO: Welcome to sccBoot 1.3.0 (build date Aug 25 2010 - 15:55:06)
I'll look at installing sccKit 1.4.0. Do you know offhand if I can do this without root access to my datacenter?
Yes, you need to be root to install 1.4.0. If you are using a Data Center system, we can do that for you.
We've been requiring approval from the PI for this. Why? 1.4.0 affects performance ... in a positive way. So if you are doing a series of performance runs, you don't want to change your baseline midstream.
Installation takes anywhere from 30 minutes to a day. It seems to depend on the quality of the hw eMAC ports. The majority of installations take about an hour, and we schedule two hours unless it's one of the unfortunate cases.
It's more than software. If we enable one eMAC port we add another ethernet cable, two cables if we enable two ports.
I think the most efficient way to proceed is to create a Bugzilla under "Marc Administration Needed." And your PI (are you the PI?) should send mail to SCC Research Proposals (but right now my email cannot even expand that folder; it used to be able to). If the PI could make a comment on the Bugzilla entry, we can forward it to the right place.
Is it essential to use 1.4.0 with POPSHM? I don't think so, but I've always used it with 1.4.0. We don't really have any remaining 1.3.0 internal systems to work with. Note also that 1.4.1 is about two weeks away. The sccGui for 1.4.1 has a POPSHM dropdown menu.
It's possible to run 1.4.0 without eMAC. Some users do, although honestly I think you should take advantage of the eMAC performance improvement.
Thanks to your help I have managed to get POP-SHM working for a simple program running on one core.
I now wish to move on to a multicore application, where I will synchronise between cores.
If I understand correctly what you have written above, is it right that the best thing is to write a non RCCE application, but to call into RCCE to do synchronization?
Having looked a bit at RCCE, what I am planning is this:
Use RCCE to launch a bunch of executions on multiple cores, have the cores allocate some flags via the RCCE api, and then hit a barrier to make sure all flags are allocated.
From there on, I will not use RCCE at all, except to write to and read from flags. Am I correct that flag operations are atomic?
At the end of the application, the cores will hit a barrier and then free all the flags.
Does this sound along the right lines?
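To make the shape of that plan concrete, here is a minimal sketch of the same structure (set up flags, meet at a barrier, then synchronise only through flag writes and reads) that runs on an ordinary Linux box. It does not use RCCE at all: forked processes stand in for cores, C11 atomics in a shared mmap region stand in for RCCE flags, and a counter stands in for the barrier. All names here are made up for illustration, and nothing in it says anything about the atomicity guarantees of real RCCE flags.

```c
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS and kill() under strict -std= modes */
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 8

struct shared_state {
    atomic_int arrived;            /* one-shot barrier counter */
    atomic_int flag[MAX_WORKERS];  /* flag[i] == 1 means worker i may proceed */
    long token;                    /* plain shared data, ordered by the flags */
};

static void spin_until(atomic_int *v, int target)
{
    while (atomic_load_explicit(v, memory_order_acquire) != target)
        sched_yield();
}

static void worker(struct shared_state *s, int me, int n)
{
    /* Barrier: nobody continues until all n workers have arrived. */
    atomic_fetch_add(&s->arrived, 1);
    spin_until(&s->arrived, n);

    /* Flag hand-off: worker 0 starts, every other worker waits its turn. */
    if (me != 0)
        spin_until(&s->flag[me], 1);
    s->token += me + 1;            /* only one worker is here at a time */
    if (me + 1 < n)
        atomic_store_explicit(&s->flag[me + 1], 1, memory_order_release);
}

/* Run n workers in a ring hand-off; returns the final token, or -1 on error. */
long run_flag_ring(int n)
{
    assert(n >= 1 && n <= MAX_WORKERS);

    struct shared_state *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;
    atomic_init(&s->arrived, 0);
    for (int i = 0; i < MAX_WORKERS; i++)
        atomic_init(&s->flag[i], 0);
    s->token = 0;

    pid_t pids[MAX_WORKERS];
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            worker(s, i, n);
            _exit(0);
        }
        pids[started++] = pid;
    }

    /* If something failed, kill the remaining workers so none spins forever. */
    int ok = (started == n);
    for (int i = 0; i < started; i++) {
        int status;
        if (!ok)
            kill(pids[i], SIGKILL);
        if (waitpid(pids[i], &status, 0) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }

    long result = ok ? s->token : -1;
    munmap(s, sizeof *s);
    return result;
}
```

With n workers, worker i adds i + 1 to the token in strict order, so run_flag_ring(4) returns 1 + 2 + 3 + 4 = 10. In an RCCE version, the barrier and the flag stores and loads would be replaced by the corresponding RCCE calls, and the shared token would live in POP-SHM memory.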
You're breaking new ground here. Which is good.
Writing a RCCE application to do POPSHM is just fine. What I was trying to say was that the RCCE-specific shared memory calls use memory hijacking. Like RCCE_shmalloc(), etc. So I don't think you should use those when doing POPSHM. But the POPSHM calls should work.
What we'd like to do is rewrite the RCCE shared memory calls to know about POPSHM instead of memory hijacking. Hijacking was a hack that we used before POPSHM was available. We want to "unhackify" RCCE.
But no one has had the time to do that yet. We want to. If you want to become a RCCE contributor, the opportunity is here! Note that RCCE_init() modifies the LUT for memory hijacking. That modification should not affect POPSHM.
I'm not clear about what you mean by "not using RCCE." RCCE applications (they are linked with the RCCE library and have a RCCE_APP() call) are loaded through rccerun. Non-RCCE applications are loaded with pssh. rccerun is a script around pssh. It does some extra stuff ... like initialize the MPB, etc. If I were trying out POPSHM with the current RCCE I would write a RCCE application and be careful not to use the RCCE memory hijacking feature.
Then, I'd want the RCCE shared memory facility to use POPSHM when I got serious.
BTW, the best RCCE to use is from the trunk, not the tag. The only reason we haven't tagged a new RCCE release is because the emulator in the trunk is broken. The emulator was very useful when we did not have hardware but has since fallen behind in rev. Work focused on RCCE running on hardware. Some people still say the emulator is useful ... it's easier to use a debugger with it, for example.
Regarding becoming an RCCE contributor - perhaps! I'll see how I go first with the application I'd like to accelerate :-)
I'm having trouble combining RCCE and POP-SHM due to duplicate symbols.
I wrote a very simple POP-SHM program for one core, which locally allocates two arrays of 1024 integers, A and B, fills array A with some data, then copies it to B via shared memory, and checks that B has the same data. So, essentially, I simulate the memcpy on the first line below with the two POP-SHM calls that follow it:
memcpy(B, A, 1024*sizeof(int));
popshm_memcpy_put(0, A, 1024*sizeof(int));
popshm_memcpy_get(B, 0, 1024*sizeof(int));
This works just fine.
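For anyone who wants to try the same round-trip check without SCC hardware, here is a minimal sketch with the POP-SHM calls swapped out for stand-ins. fake_put and fake_get are hypothetical helpers that copy to and from a local byte array; they only mirror the argument order used in the calls above (offset first for put, destination first for get) and are not the POP-SHM API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FAKE_SHM_BYTES (16 * 1024)

/* Local byte array standing in for the shared buffer. */
static unsigned char fake_shm[FAKE_SHM_BYTES];

/* Stand-in for a "put": copy n bytes from src to the buffer at offset.
 * Returns 0 on success, -1 if the range does not fit. */
int fake_put(size_t offset, const void *src, size_t n)
{
    assert(src != NULL);
    if (offset > FAKE_SHM_BYTES || n > FAKE_SHM_BYTES - offset)
        return -1;
    memcpy(fake_shm + offset, src, n);
    return 0;
}

/* Stand-in for a "get": copy n bytes from the buffer at offset to dst.
 * Returns 0 on success, -1 if the range does not fit. */
int fake_get(void *dst, size_t offset, size_t n)
{
    assert(dst != NULL);
    if (offset > FAKE_SHM_BYTES || n > FAKE_SHM_BYTES - offset)
        return -1;
    memcpy(dst, fake_shm + offset, n);
    return 0;
}

/* Fill A, push it through the buffer into B, and compare.
 * Returns 1 if B ends up identical to A, 0 otherwise. */
int roundtrip_matches(void)
{
    int A[1024], B[1024];

    for (int i = 0; i < 1024; i++) {
        A[i] = 3 * i + 1;   /* arbitrary test pattern */
        B[i] = 0;
    }
    if (fake_put(0, A, sizeof A) != 0)
        return 0;
    if (fake_get(B, 0, sizeof B) != 0)
        return 0;
    return memcmp(A, B, sizeof A) == 0;
}
```

The bounds checks are there because an offset-addressed interface makes it easy to run past the end of the buffer. Whether the real library performs similar checks is something to confirm in the POP-SHM User's Guide.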
Now I am trying to do the following:
ME = RCCE_ue();
if(0 == ME)
// My original program, using POP-SHM
In other words, as an initial sanity check, I just want to make sure I can use a load of cores with RCCE, and have just one of them (the one with ue == 0) run my original program. After this, I plan to try out inter-core synchronization using the RCCE API, and then eventually get on to the application I really care about.
The problem I am having is as follows. For my original program, just with POP-SHM, I found I had to link to both popshm.o and scc_api.o, which I compiled from popshm-20110126.tar. This was fine. Now, when I use RCCE as well, I clearly need to link in the RCCE library - I'm doing this by linking to libRCCE_bigflags_nongory_nopwrmgmt.a, which I also compiled (and which works for the PINGPONG example in rcce/apps).
It seems that libRCCE_bigflags_nongory_nopwrmgmt.a and scc_api.o include a lot of common symbols, so I get linker errors:
/home/ally/marcrepository/trunk/rcce/bin/SCC_LINUX/libRCCE_bigflags_nongory_nopwrmgmt.a(SCC_API.o)(.text+0x0): In function `InitAPI':
: multiple definition of `InitAPI'
../popshm-20110126/scc_api.o(.text+0x0): first defined here
ld: Warning: size of symbol `InitAPI' changed from 159 in ../popshm-20110126/scc_api.o to 120 in /home/ally/marcrepository/trunk/rcce/bin/SCC_LINUX/libRCCE_bigflags_nongory_nopwrmgmt.a(SCC_API.o)
(and lots more similar ones)
Can you advise on this? E.g., could I just remove some of the implementations of these functions from SCC_api.c? I hesitate to do so because the symbols have different sizes, as indicated in the above warning.
I attach the C file for my example program, for reference.