I am running a parallel application on SCC and I cannot get any speedup, even with two cores. The same program does speed up on a regular multi-core machine, where the algorithm scales up to 8 cores.
After tracing the MPI messages, I found that I issue around 50000 blocking receives (the sends are non-blocking), and around 200 of these receives are far too costly on SCC. The receive times of these 200 messages are up to 1000 times longer than those of the other messages, and I believe these are the messages that cause the problem. There is nothing specific about these messages. I checked the receive times of the same messages on a dual-core machine and found that they are the same as for the other messages there.
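For anyone who wants to run the same kind of check on their own trace, here is a minimal sketch of how the slow receives can be picked out. It assumes the per-receive durations have already been extracted from the trace into a plain list of seconds; the function name `find_slow_receives`, the `factor` threshold, and the sample values are hypothetical and only for illustration, not what my tracing tool actually does.

```python
from statistics import median


def find_slow_receives(durations, factor=100.0):
    """Return (index, duration) pairs for receives slower than factor * median.

    durations: per-receive wall-clock times in seconds (assumed to be
               already extracted from the MPI trace).
    factor:    hypothetical cutoff; a receive counts as "slow" if it takes
               more than `factor` times the median receive time.
    """
    if not durations:
        return []
    threshold = factor * median(durations)
    return [(i, d) for i, d in enumerate(durations) if d > threshold]


if __name__ == "__main__":
    # Made-up trace: mostly 10 microsecond receives, two much slower ones.
    trace = [10e-6] * 20
    trace[5] = 10e-3
    trace[12] = 8e-3
    for index, duration in find_slow_receives(trace):
        print(f"receive #{index} took {duration:.6f} s")
```

Using the median as the baseline keeps a few hundred very slow receives from shifting the reference point, which a mean would not do.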
Has anybody else experienced such a performance issue on SCC? Any ideas?