Can you be specific about what you want to modify the LUT entries to?
This is not something most users (at least currently) want to do ... hence the TBD. But if you describe what you want your modification to the LUT tables to accomplish, I can work with you on how you might accomplish that. Our results might then flesh out that TBD.
Have you looked at the EAS? http://communities.intel.com/docs/DOC-5365
With the information detailed there you should be able to dynamically program the LUTs. The Appendix gives the default mappings used for Linux, which should serve as examples.
Jim is correct. If you want to consult about modifying the LUTs dynamically, we can. Putting together a running example that we can post on this site would be a good contribution to the community.
It's also possible to modify the LUTs at startup ... that is, boot SCC Linux with a different set of values. How to do that, I think, needs some further documentation.
Actually, I'm investigating both approaches: at startup (let's call it "static") and on the fly ("dynamic"). At first I was a bit confused about the correct settings of the route, subDestId and addrDomain bits for addressing the respective MCs, but after going over the EAS I believe I have figured that out now. The exception is the bypass bit, whose meaning is still not entirely clear to me - but it's not set in the default LUT entries pointing to private memory anyway, and those are the ones I'm mainly interested in.
I haven't looked into dynamic changes any further so far, since static changes seemed easier and dynamic changes require the memory contents to be moved manually along with the reassignment of the private memory space, but I'd like to explore this further.
For the static approach, I'm simply modifying /tmp/sccKit_username/linux.mt and running sccMerge afterwards. Is this the correct approach, or is there a better way of doing this? I've also run into some limitations: for example, I can't assign a single MC to more than 31 cores, because the fopen() call used to open mch_0_0.32.obj (in sccExtApi.cpp) cannot handle files larger than 2 GB when sccKit is compiled as a 32-bit binary, and 31/32 cores with the default Linux image appears to be the point at which mch_0_0.32.obj grows past 2 GB. Maybe this should be filed as a bug and the source code changed to use fopen64()?
With MC0 assigned to between 13 and 31 cores, however, sccBoot completes without any errors or warnings, but the cores are not reachable via ssh, so there is still some other problem whose cause I haven't found yet. If the number of cores is small enough, e.g. the standard 12 or below, it works perfectly. My modified linux.mt looks like this:
# pid mch-route mch-dest-id mch-offset-base testcase
0x00 0x00 6 0x00 /tmp/sccKit_philippg/linux.obj
0x01 0x00 6 0x01 /tmp/sccKit_philippg/linux.obj
0x02 0x00 6 0x02 /tmp/sccKit_philippg/linux.obj
0x17 0x00 6 0x17 /tmp/sccKit_philippg/linux.obj
Maybe it's some error in the offset?
Look at bug 46 in the Bugzilla database. We no longer recommend setting the bypass bit. Also, this is mentioned in our SCC errata, http://communities.intel.com/community/marc/sccerrata
We'll look into your other issues.
Re: bypass bit: thanks for mentioning it!
Re: ssh: well, my most basic check of whether the cores are operational is ssh'ing into them, but yes, this also means bind9 is not involved (I'm aware of the binding issue). Likewise, there are no echoes when using ping. I rarely use sccKonsole or any GUI tools due to the long RTT of our connection from Europe.
Note that I have sccKit 1.3.0.
Yes, I think modifying the linux.mt is what you want to do. It looks as if what you want is all cores going through one memory controller (from your example), namely the one at 0x00. The subdest is 6 because this memory controller is on the West side. Is this your intention? I made a linux.mt that I think does this. I attached it. Does it differ from yours?
With that linux.mt, I issued the command ...
sccMerge -m 8 -n 48 -noimage linux.mt
This loads Linux on the 48 cores. The file arguments.txt leads me to believe that I now have all cores going through the MC at 0x00 because cores_per_mch=48. But I want to verify this. I haven't tried to change any LUT entries yet.
Basically what I want to do is check with you that this is what you want ... all cores going through one mem controller. Can you also be more specific about what you want to change the LUT entries to?
Yes, sorry, I wasn't too clear about my intentions: I'd like to experiment with different memory-controller-to-core assignments. That is, I'd like to "move" cores to different memory controllers (e.g. try the standard setup; use only one MC for all 48 cores; use two MCs with 24 cores each; use two MCs with a 36/12 split; etc.). This is something I definitely want to do at startup (i.e. boot the system with different core-MC assignments), so no on-the-fly LUT modifications are necessary for it. Later I'd also like to try dynamic modifications while the system is running, which is harder since you basically have to move all your data at the same time. The latter is more or less optional; right now it's more important to get the first variant working.
At the moment I believe there are two open issues: first, the file-size limitation of sccBoot (or rather a subfunction of sccExtApi used by sccBoot) prevents me from booting more than 31 cores per MC (i.e. sccBoot does not complete the operation); second, I can't seem to boot successfully with more than 12 cores assigned to a single MC (i.e. sccBoot completes without errors but the cores do not seem to be operational). I'm unsure whether the latter is a problem of sccMerge or sccBoot (or me...).
> With that linux.mt, I issued the command ...
> sccMerge -m 8 -n 48 -noimage linux.mt
> This loads Linux on the 48 cores.
Well not exactly. As far as I understood it, sccMerge just creates a single memory object file from the *.obj file(s) specified in linux.mt together with other relevant information contained in linux.mt (i.e. on which memory controller do we plan to place what image with what offset). It does not actually load anything into the memory yet.
This also completes without any errors - but did you try booting from this image with sccBoot? With the standard Linux image you supply, that is not possible for more than 31 cores on a single MC, since the memory object for that MC then exceeds a file size of 2 GB and the 32-bit version of sccKit currently cannot handle such large files.
Upon issuing the boot command, sccBoot calls sccExtApi::writeMemFromOBJ() in the file sccKit_V1.3.0/sccGui/src/sccExtApi.cpp. This function uses fopen() to open the memory object (sccExtApi.cpp, line no. 85), and if the whole thing is compiled as a 32-bit binary (which the available binaries seem to be), 2 GB is the maximum file size fopen() can open; for larger files, the call does not return a valid file pointer. With the standard Linux image being ~65.46 MB, 31 cores per MC lead to a 2029 MB file and 32 cores per MC lead to a 2094 MB file.
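To illustrate the suggested fix: on glibc, fopen64() (or compiling with -D_FILE_OFFSET_BITS=64) gives a 64-bit off_t even in a 32-bit build. This is only a sketch of the idea, not the actual sccKit code; file_size64() is a hypothetical helper:

```c
/* Sketch of the suspected fix: use fopen64()/ftello64() so that a
 * 32-bit binary can open memory images larger than 2 GB.
 * file_size64() is a hypothetical helper, not part of sccKit. */
#define _LARGEFILE64_SOURCE
#include <stdio.h>

/* Returns the size of 'path' in bytes, or -1 on error. */
long long file_size64(const char *path)
{
    FILE *fp = fopen64(path, "rb"); /* 64-bit off_t even on 32-bit builds */
    if (fp == NULL)
        return -1;
    /* fseeko64/ftello64 keep the offset in 64 bits as well. */
    fseeko64(fp, 0, SEEK_END);
    long long size = (long long)ftello64(fp);
    fclose(fp);
    return size;
}
```

Alternatively, defining _FILE_OFFSET_BITS=64 for the whole build transparently redirects the plain fopen()/ftello() calls to their 64-bit variants, which would avoid touching the call sites at all.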
So to my knowledge, booting the cores from the memory image created by your sccMerge command with -n 48 should fail (unless you're using a Linux image smaller than ~42.66 MB). Our MCPC is still running sccKit 1.2.3 but will be updated to 1.3.0 in a couple of hours. I will try again afterwards, but after looking at the 1.3.0 source code I expect the problem to remain if the binaries are compiled in 32 bit.
Other than that your linux.mt looks just like mine.
But now to the second issue: Can I ask you to try booting e.g. 14 cores with the MC at 0x00? I tried this using your linux.mt and simply deleting everything after core id 0x0d. I then issued
sccMerge -force -noimage -m 8 -n 14 ./linux.mt
sccBoot -g ./obj
and both commands complete successfully, however the cores do not seem to be operational. Can you verify this?
Booting 48 instances of the (default) Linux image on a single MC cannot work, because the Linux kernel assumes that it has 320 MB of private space. As we "only" have 8 GB per MC, the Linux instances won't have enough private space when they share a single MC (they will start to overwrite the other kernels' private sections with their own stacks etc.). It is possible to do this with a customized Linux image that only consumes 160 MB of memory. With the current default image you can boot at most 24 instances on a single MC.
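A quick sanity check of these numbers (the assumption that the gap between the raw slot count and the stated maximum of 24 is taken up by the shared-memory region that sccMerge also places on the MC is mine):

```python
# Worked check of the capacity argument above, taking 8 GB per MC and
# 320 MB of private memory per default Linux instance (1 GB = 1024 MB):
mc_capacity_mb = 8 * 1024
default_image_mb = 320
small_image_mb = 160  # the customized image mentioned above

print(mc_capacity_mb // default_image_mb)  # 25 raw slots; with shared memory
                                           # also on the MC, 24 in practice
print(mc_capacity_mb // small_image_mb)    # 51 slots, enough for all 48 cores
```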
I see now that Michael is correct about not being able to put 48 cores through one memory controller.
I think the basic procedure for smaller numbers of cores is: 1) create an mt file, 2) use sccMerge to create the obj directory with its contents, 3) boot with sccBoot -g, and 4) release the resets on the booted cores. I'm having some mixed results with this procedure ... sometimes after releasing the resets I cannot access the cores. Are you seeing similar behavior?
Philipp, I did try doing the sccMerge on 14 nodes as you suggested. It seemed to load, but the cores will not respond to a ping. Michael, can you see anything wrong in what I did? I attached a file showing my commands.
boot14.txt.zip 3.3 K
sccMerge partitions the available memory of each MC (in our case -m 8) based on the number of cores (-n option). Thus the position of the shared memory on the MC for 12 cores is 0x1ec, while the SHM location for 14 cores is 0x1ea. You can check this offset (after pre-loading the LUTs) with the following command:
# sccDump -c 0 | grep "Entry 0x80"
LUT0, Entry 0x80 (CRB addr = 0x0c00): 0_0x00_6(PERIW)_0x1ea
LUT1, Entry 0x80 (CRB addr = 0x1400): 0_0x00_6(PERIW)_0x1ea
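As an aside, the fields printed in that line can be packed into the raw LUT entry. The bit positions below are my reading of the EAS layout (bypass, 8-bit tile route, 3-bit sub-destination ID, 10-bit address extension); verify them against the EAS before relying on this sketch:

```python
# Sketch: packing the fields from the sccDump line above
# (0_0x00_6(PERIW)_0x1ea) into one LUT entry value. Assumed layout:
# [21] bypass, [20:13] tile route, [12:10] subdestID, [9:0] addr extension.
def encode_lut_entry(bypass, route, subdest, offset):
    return (bypass << 21) | (route << 13) | (subdest << 10) | offset

entry = encode_lut_entry(bypass=0, route=0x00, subdest=6, offset=0x1ea)
print(hex(entry))  # 0x19ea: offset in the low 10 bits, subdest 6 above it
```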
The Network driver (part of crbif) has a parameter that defines the offset to the shared memory. This offset is 0x1ec200000 by default. It needs to be changed to 0x1ea200000 in order to work with the 14 core setup:
# echo 0x1ea200000 > /sys/module/crbif/parameters/txBaseAddress
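The relationship between the offset reported by sccDump and this parameter appears to be simple arithmetic, assuming the 16 MB LUT granularity from the EAS (so the offset is shifted up by 24 bits) plus the fixed 0x200000 intra-page offset visible in the default value; this is my reading, not official documentation:

```python
# Sketch of how txBaseAddress relates to the SHM offset reported by
# sccDump, assuming 16 MB LUT granularity (offset << 24) plus the
# 2 MB intra-page offset seen in the default value 0x1ec200000:
def tx_base_address(shm_offset):
    return (shm_offset << 24) + 0x200000

print(hex(tx_base_address(0x1ec)))  # 0x1ec200000 (12-core default)
print(hex(tx_base_address(0x1ea)))  # 0x1ea200000 (14-core setup)
```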
I have to bring this up again since it still does not work properly. I am trying to boot 14 cores "attached" to the first memory controller, MC0, and while this works for the majority of the cores, it is never the case that all of them are reachable afterwards via ping/ssh. Furthermore, if I power-cycle the hardware (which I know is not necessary for this procedure, but still...), the set of unreachable cores can change: e.g. after one run, 3 cores are unavailable; after a power cycle and a second run, 4 cores are unavailable.
My procedure is as follows:
* Create a linux.mt (attached to this post) with 14 cores "attached" to MC0, using the standard linux image provided by Intel
* sccMerge -m 8 -n 14 -noimage -force ./linux.mt
* sccReset -g
* sccBoot -g ./obj
* get the network offset
* grep "// 0x80" ./obj/lut_init.vi | head -1
* or alternatively sccDump -c 0 | grep -e "Entry 0x80" which obviously returns the same result, in this case 0x1ea
* set the SHM address for the network driver with echo 0x1ea200000 > /sys/module/crbif/parameters/txBaseAddress
* sccReset -r 0..13
* test the availability by pinging cores 0..13 which - in my last run - resulted in:
philippg@marc007:~$ ./status.sh 14
192.168.0.6 NO ANSWER
192.168.0.8 NO ANSWER
192.168.0.11 NO ANSWER
192.168.0.12 NO ANSWER
Total number of reachable cores: 10
Remark: status.sh is a simple bash script that pings all IP addresses from 192.168.0.1 to 192.168.0.N, where N is the command-line parameter (14 in this case). I use it instead of sccBoot -s because sccBoot -s does not report the status of the cores correctly.
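For completeness, a hypothetical reconstruction of such a status.sh (the original was not posted; the ping flags are one plausible choice):

```shell
#!/bin/bash
# Hypothetical reconstruction of status.sh: ping 192.168.0.1 up to
# 192.168.0.N once each and count how many cores answer.
N=${1:-14}
reachable=0
for i in $(seq 1 "$N"); do
    ip="192.168.0.$i"
    if ping -c 1 -W 1 "$ip" >/dev/null 2>&1; then
        reachable=$((reachable + 1))
    else
        echo "$ip NO ANSWER"
    fi
done
echo "Total number of reachable cores: $reachable"
```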
linux.mt.zip 282 bytes