Do you have your own MCPC/SCC? Why are you installing sccKit 1.3.0? I think if you follow the instructions to configure the MCPC, you will end up with sccKit 1.3.0, but I recommend that you upgrade to sccKit 18.104.22.168. The newer sccKit comes with a newer SCC Linux, which supports insmod. The SCC Linux that comes with 1.3.0 does not.
We have our own MCPC/SCC at our institute. We installed sccKit 1.3 a year ago. I am working with it now, but we have decided to upgrade next week.
However, I need the vmlinuz or System.map file to install BLCR. Is there some way to find them?
I also tried another checkpointer, DMTCP.
First I sourced the cross-compilation environment script and ran ./configure --prefix=/shared/**/, then make and make install. The installation was successful.
To checkpoint a process with DMTCP, you must first start dmtcp_coordinator in a separate terminal window. That works too.
But if I start a program with checkpointing support (dmtcp_checkpoint ./myprogram), I get the following error:
ERROR: ld.so: object '/shared/***/lib/dmtcp/dmtcphijack.so' from LD_PRELOAD cannot be preloaded: ignored
This error generally means that there is a bitness incompatibility (32-bit vs. 64-bit). But I built it with the cross compiler.
I checked the file dmtcphijack.so, and it is 32-bit.
What can be the cause of this error message?
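One way to double-check the 32-bit claim without relying on the `file` utility is to read the ELF header directly. The following is only a sketch: the function name `elf_class` is made up for illustration, and it assumes a POSIX shell with `od` available (for example on the MCPC, where the library was built). It prints 32 or 64 according to the EI_CLASS byte at offset 4 of the file.

```shell
# Hypothetical helper: report whether an ELF file is 32-bit or 64-bit.
# Assumes a POSIX shell and od(1); not part of DMTCP or sccKit.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Byte at offset 4 is EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    case $(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n') in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}

# Usage (HIJACK_LIB stands for your own full path to dmtcphijack.so):
#   elf_class "$HIJACK_LIB"
```

If this prints 32, bitness is probably not the cause. As far as I know, ld.so prints the same "cannot be preloaded" message whenever it fails to load the object for any reason, so it may also be worth checking that the path is readable from the core itself and that the libraries dmtcphijack.so depends on exist in SCC Linux.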
Do you build your own Linux? If you do, you can find vmlinux and System.map in buildroot-2011.05. See http://communities.intel.com/docs/DOC-6869 for information about how to build SCC Linux.
$ find . -name System.map -print
$ find . -name vmlinux -print
The typical way people add packages to SCC Linux is to modify its buildroot environment. There's a menu that we use to select features. I don't see an option for checkpointing in these menus, but you should be able to add features to them. I don't know how to do that, but someone here must. It means understanding buildroot; there's nothing SCC-specific about it. http://buildroot.uclibc.org/
Actually I'm surprised that our cross environment supports configuring and building DMTCP. So that may be another way of getting checkpointing support.
I have not used DMTCP. I assume that "a separate terminal window" means another ssh connection into a core, and that you start dmtcp_coordinator there. How do you know it's working? Then you attempt to start a program with checkpointing support and get an error. Where do you run dmtcp_checkpoint?
We haven't used shared libraries much on the SCC. There is a post dealing with shared libraries.
I haven't come across anyone trying to checkpoint on the SCC. I'll ask around and see if I can get more information.
First I log into a core and start dmtcp_coordinator. I am not sure if it works properly, but I don't get any errors and I see these messages:
Checkpoint Interval: disabled (checkpoint manually instead)
Exit on last client: 0
Type '?' for help.
It seems to be working. These are the usual messages; I get the same ones on my own Linux machine, where I can checkpoint without any problem.
After that I log into the same core from another terminal and try to start a program with checkpointing support.
PS: Here is how DMTCP works internally (very briefly): http://dmtcp.sourceforge.net/FAQ.html#internalWorking
Can you be more specific about why you want to use checkpointing? Are you interested in "automatic checkpointing" or "application-specific, user-level checkpointing"? Are you doing research on checkpointing or using it as a tool for some other research?
I want to use it for process migration in MPI applications, and I wonder whether anyone else has done this.
Is there any other way to do process migration in MPI applications on SCC Linux?
There is certainly interest in process migration on the cores. Unfortunately there is little interest in checkpointing. I don't know how the two are related. Can one do process migration without checkpointing? I would hope so. I think checkpointing is very memory intensive; it may save more state than is necessary for migration.