Here are a few questions concerning RCKMPI that might be of general interest:
1.) Multiple channels: Is it possible to compile for multiple channels at once and then to select for each run separately which of the compiled channels is to be used (sccmpb, sccmulti, sock, ...)?
Configuring with multiple "--with-device" options only appears to select the last channel specified (e.g.: "--with-device=ch3:sccmulti --with-device=ch3:sock" only configures for sock). Trying to give one "--with-device" option multiple values (e.g.: "--with-device=ch3:sccmulti,sock" or "--with-device=ch3:sccmulti,ch3:sock") fails with an error about a channel that does not exist.
In case compilation for multiple channels is possible, how would one select the channel to be used? (OpenMPI's "mpirun" supports, e.g., something like "--mca btl tcp,self" ...)
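If it turns out that one installation can only hold one channel, per-run selection could perhaps be approximated on the MCPC by switching PATH between installations. A minimal sketch, assuming a hypothetical layout with one install prefix per channel below $RCKMPI_ROOT (both the variable name and the layout are my assumptions, not RCKMPI conventions):

```shell
# Hypothetical layout: $RCKMPI_ROOT/<channel>/bin holds mpicc and friends.
RCKMPI_ROOT="${RCKMPI_ROOT:-$HOME/rckmpi}"   # assumed location, adjust as needed

# Put the tools of one channel first in PATH and drop any previously
# selected channel, so that entries do not pile up.
use_channel() {
    channel="$1"
    new_path="$RCKMPI_ROOT/$channel/bin"
    old_ifs="$IFS"; IFS=:
    for dir in $PATH; do
        case "$dir" in
            "$RCKMPI_ROOT"/*) ;;                 # entry of an earlier selection
            *) new_path="$new_path:$dir" ;;      # keep everything else in order
        esac
    done
    IFS="$old_ifs"
    PATH="$new_path"
    export PATH
}
```

Calling "use_channel sock" before invoking mpicc would then pick the sock tools. This only switches the tools on the MCPC; whether the application and the copies on the cores have to be exchanged as well is exactly what question 4 asks.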
2.) Reconfiguring: After having configured, compiled and installed RCKMPI for one channel, what has to be done before configuring for another channel (to be installed to a different location)?
At first sight, using "make clean" appeared to work. Is this sufficient, more than necessary, or exactly what is needed?
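For comparison: in autotools-style packages, "make clean" usually removes only build products and keeps the results of configure, while "make distclean" removes those as well. One way to sidestep the question would be one build directory per channel. The following is a dry-run sketch only. It assumes that the RCKMPI sources support out-of-tree builds the way upstream MPICH2 does (I have not verified this), SRC and PREFIX_BASE are hypothetical paths, and any further configure options a real RCKMPI build needs are left out:

```shell
# Dry run: print the commands for one out-of-tree build per channel.
# SRC and PREFIX_BASE are hypothetical; pipe the output to sh to execute it.
SRC="${SRC:-$HOME/rckmpi-src}"
PREFIX_BASE="${PREFIX_BASE:-$HOME/rckmpi}"

build_plan() {
    for channel in "$@"; do
        echo "mkdir -p $SRC/build-$channel"
        echo "cd $SRC/build-$channel"
        echo "$SRC/configure --with-device=ch3:$channel --prefix=$PREFIX_BASE/$channel"
        echo "make"
        echo "make install"
    done
}

build_plan sccmulti sock
```

With separate build directories, no channel's object files or configure results can leak into another channel's build, so no cleaning step is needed between them.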
3.) Paths on cores: Is it possible to set paths (PATH and LD_LIBRARY_PATH) on the cores for RCKMPI and its prerequisites (in /shared/... instead of copying everything to /usr/bin and /usr/lib)? (Copying wastes quite a lot of memory and, thus, severely limits problem sizes. Additionally, copying takes quite some time, especially since it has to be done anew after each boot of the cores.)
The key problem appears to be the limited environment provided by ssh when starting the MPI daemons: Apparently, PATH is hardcoded into ssh to contain only "/usr/bin:/bin". No apparent startup script is processed (e.g.: /etc/profile).
Preparing an environment according to "man ssh" and "man sshd_config" failed, too: Create /etc/ssh/, create /etc/ssh/sshd_config with permissions 644 and contents "PermitUserEnvironment yes", create ~/.ssh/environment with permissions 700 and contents "PATH=/usr/bin:/bin:/shared/...".
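For reference, the attempted setup can be written down as a sketch that stages the two files in a scratch directory for inspection before anything is copied onto a core. STAGE and EXTRA_PATH are hypothetical names, and the "/shared/..." value is the same placeholder as above. Two points from the man pages to keep in mind: sshd reads sshd_config only at startup, so an already running daemon has to be restarted, and both files only have an effect if the server on the cores is actually OpenSSH's sshd.

```shell
# Stage the files below a scratch directory; nothing on the real system is touched.
STAGE="${STAGE:-$(mktemp -d)}"
EXTRA_PATH="${EXTRA_PATH:-/shared/...}"   # placeholder, substitute the real directory

mkdir -p "$STAGE/etc/ssh" "$STAGE/home/.ssh"

# Server side: allow ~/.ssh/environment to be processed at login.
printf 'PermitUserEnvironment yes\n' > "$STAGE/etc/ssh/sshd_config"

# User side: plain NAME=value lines; variables are not expanded in this file.
printf 'PATH=/usr/bin:/bin:%s\n' "$EXTRA_PATH" > "$STAGE/home/.ssh/environment"

chmod 644 "$STAGE/etc/ssh/sshd_config"
chmod 700 "$STAGE/home/.ssh"              # conventional mode for the directory
chmod 600 "$STAGE/home/.ssh/environment"  # conventional mode for the file

echo "staged under $STAGE"
```

The sketch uses mode 600 for the environment file and 700 for the directory, which is the usual convention; I do not know whether the 700 on the file itself matters here.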
4.) Switching channel: What has to be done to switch from one channel to another (e.g. sock to sccmulti)?
a) Is it necessary to exchange the rckmpi/bin copies on the cores (or are these generic)?
b) Is it necessary to exchange the rckmpi/lib copies on the cores? Probably not, since they are all static libraries (lib*.a). -> Are they needed on the cores at all?
c) Is it necessary to rebuild the MPI application? (Probably yes, since the MPI functions are linked statically. Or is everything channel-specific contained within rckmpi/bin/ ?)
d) Supposedly, PATH on the MCPC has to be adapted to use the tools (mpicc, ...) produced for the desired channel. LD_LIBRARY_PATH is probably insignificant?
e) A suitable kernel has to be running. Would a POPSHM-enabled kernel be suitable for all channels?
5.) Detect channel:
a) How can an application detect at runtime which channel it uses?
b) How can one detect at compile time, for which channel mpicc compiles?
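For the compile-time side, one thing that might work: upstream MPICH2 installs a tool called mpich2version next to mpicc, and its output contains a line naming the device. The sketch below assumes that RCKMPI ships this tool and keeps the "MPICH2 Device:" label unchanged, neither of which I have confirmed. device_of is a helper name made up for this example.

```shell
# Extract the device/channel (e.g. "ch3:sock") from mpich2version-style
# output read on stdin. The "MPICH2 Device:" label is what upstream MPICH2
# prints; that RCKMPI prints the same label is an assumption.
device_of() {
    sed -n 's/^MPICH2 Device:[[:space:]]*//p'
}

# Only query the tool if it is actually installed and found in PATH.
if command -v mpich2version >/dev/null 2>&1; then
    mpich2version | device_of
fi
```

This would report the channel of whichever installation comes first in PATH, which should be the one whose mpicc gets used. It does not answer a), since it says nothing about what an already linked application uses at runtime.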
6.) Compilation for POPSHM: Evidently, RCKMPI can be compiled for channel sccmulti without an extra library being installed or even a POPSHM-enabled kernel running. Even the corresponding MPI daemons appear to work (at least when running an application compiled for sccmpb). Will such a POPSHM-targeted RCKMPI work as expected under a POPSHM-enabled kernel, or should a POPSHM-targeted RCKMPI only be compiled while a POPSHM-enabled kernel is running?
I'd be glad to learn more about any of the questions above.
Thank you in advance