The latter is correct: the current operation mode of the SCC system is to act as a 'cluster on a chip', meaning that you get 48 Linux instances with a very fast network interconnect. By default, each instance works in its own private region of the global memory. If your application has a large read-only data set, you could put it into a shared memory region (configured through the LUTs) that is accessible from all cores. The executable itself should normally never exceed a couple of megabytes, and therefore does not really count.
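To make the private versus shared distinction concrete, here is a minimal sketch of the same idea on an ordinary Linux machine. It is an analogy only and not SCC code: it uses fork() and an anonymous shared mapping instead of LUT configuration, and the function name `run_workers`, the 16-byte table, and the worker count are all made up for illustration.

```python
import mmap
import os
import struct

def run_workers(num_workers=4):
    """Fork workers that read one shared table but keep private counters.

    Analogy only: plain Linux processes stand in for SCC cores.
    """
    table = bytes(range(16))            # stand-in for a large read-only data set
    shared = mmap.mmap(-1, len(table))  # anonymous mapping, shared across fork()
    shared.write(table)

    private_counter = 0                 # every process gets its own copy on fork()
    read_fd, write_fd = os.pipe()
    pids = []
    for rank in range(num_workers):
        pid = os.fork()
        if pid == 0:                    # child process ("core")
            try:
                os.close(read_fd)
                private_counter += rank + 1   # invisible to parent and siblings
                checksum = sum(shared[:])     # all workers read the same region
                os.write(write_fd,
                         struct.pack("iii", rank, private_counter, checksum))
            finally:
                os._exit(0)             # never fall back into the parent's code
        pids.append(pid)

    os.close(write_fd)
    for pid in pids:
        os.waitpid(pid, 0)

    # Collect one fixed-size record per worker from the pipe.
    record_size = struct.calcsize("iii")
    results = []
    while True:
        chunk = os.read(read_fd, record_size)
        if not chunk:
            break
        results.append(struct.unpack("iii", chunk))
    os.close(read_fd)
    shared.close()
    return sorted(results), private_counter

if __name__ == "__main__":
    per_worker, parent_counter = run_workers()
    print(per_worker, parent_counter)
```

Each worker reports the same checksum because all of them read the one shared table, while the counter updates stay local to the process that made them and the parent's counter is unchanged.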
By the way, the programming model (SPMD) does not mandate any particular execution model. There are OpenMP implementations that support execution on a 'share-nothing' architecture such as the current SCC setup.
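As a rough illustration of SPMD running on top of a share-nothing setup, the sketch below runs the same code in every process, lets the rank alone decide which part of the data to work on, and combines the results with explicit messages. It is neither an OpenMP implementation nor SCC code; `spmd_sum`, the pipe-based messaging, and the strided work split are assumptions chosen for this example, with plain Linux processes standing in for cores.

```python
import os
import struct

def spmd_sum(data, num_ranks=4):
    """Sum `data` using num_ranks processes that share no memory."""
    workers = []
    for rank in range(num_ranks):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:                    # child: one SPMD rank
            try:
                os.close(read_fd)
                # Same program on every rank; the rank picks the slice.
                partial = sum(data[rank::num_ranks])
                # The only way results leave the rank is an explicit message.
                os.write(write_fd, struct.pack("q", partial))
            finally:
                os._exit(0)
        os.close(write_fd)
        workers.append((pid, read_fd))

    # The parent performs the reduction over the received partial sums.
    total = 0
    for pid, read_fd in workers:
        (partial,) = struct.unpack("q", os.read(read_fd, 8))
        os.close(read_fd)
        os.waitpid(pid, 0)
        total += partial
    return total

if __name__ == "__main__":
    print(spmd_sum(list(range(100))))
```

The caller sees something that looks like a parallel reduction, yet nothing in it relies on shared memory, which is the sense in which the programming model is independent of the execution model.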