I have redone the instructions using eMAC ports B and D, but no activity lights come on and it does not work.
Unfortunately, we are experiencing some difficulty with 1.4.0 with eMAC enabled, but I think we are only days away from a fix. The problem is described in http://marcbug.scc-dc.com/bugzilla3/show_bug.cgi?id=264
What I found worked with another group today (Brown Univ) was to install 1.4.0 (not 1.4.1 or 18.104.22.168) and disable the eMAC interface. The link to the bug has an attachment called How to disable eMAC and that should take you through the steps.
When I installed 22.214.171.124 and disabled the eMAC interface, I saw the MCPC crash when I tried to boot SCC Linux. Brown saw this same problem with their system. But when they dropped back to 1.4.0 and disabled eMAC, they were once again able to work. The bug also has attached files in.rck.zone and ex.rck.zone.
Please try 1.4.0 and eMAC disabled. If this does not work, please file a Bugzilla bug.
That's unfortunate, because on 1.3 my MPI program hangs (mpirun -np 1 works, though that's not too exciting). I believe this is due to heavy input and output. Correct me if I'm wrong, but isn't the new routing of traffic over the eMAC ports supposed to fix that bug? What's new in 1.4 if the eMAC ports are disabled?
There's a doc that lists new 1.4.0 features on this site, but you are correct ... eMAC is the major new thing. However, I honestly believe that the fix is very close. It's at high priority.
I'm a little confused about which firmware should be flashed when the eMAC ports are disabled. What should be flashed, and what should my systemSetting.ini look like? Also, what does your /etc/network/interfaces file look like?
Use the latest firmware. You'll notice that the bitstreams have a _ab or a _cd in their name. When eMAC is enabled, you want the bitstream that is for the eMAC ports that you are using. But when eMAC is disabled you're using the non-eMAC portion of the bitstream and so it shouldn't matter which you use.
On one of our internal marc systems we have rl_20110627_cd.bit. You can see what bitstream you are using with the "sccBmc -c set" command. Or just telnet into the BMC and issue set.
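If you want to double-check which eMAC port pair a bitstream file is built for, here is a minimal sketch in Python. It assumes the _ab or _cd tag sits directly before the .bit extension, as in rl_20110627_cd.bit; the function name is made up for illustration and is not part of sccKit.

```python
import re


def emac_ports_from_bitstream(filename):
    """Return 'ab' or 'cd' if the name ends in _ab.bit or _cd.bit, else None.

    Assumption: the port tag is the last component before the .bit extension.
    """
    match = re.search(r"_(ab|cd)\.bit$", filename)
    return match.group(1) if match else None


print(emac_ports_from_bitstream("rl_20110627_cd.bit"))  # prints cd
```

With eMAC disabled the result should not matter, as noted above; the check is only useful once eMAC is enabled again.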
When you install the bitstream with install.csh, you want the serial number in update.txt to be larger than the serial number in the update.txt in /flash on the BMC. If install.csh does not work, issue (as root) "apt-get remove crbif-dkms" before running it.
You want the number in /opt/sccKit/current/firmware/RockyLake/update/update.txt to be larger than that in /flash/update.txt on the BMC. Since you have your own MCPC/SCC you can log into the BMC as root@<the BMC IP address>. The root password is in install.csh, and if you can become root on your MCPC you can read that.
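The serial number comparison can be sketched roughly as follows. This is only an illustration: I am assuming each update.txt contains the serial number as the first run of digits in the file, which you should verify against your own files. Both helper names are hypothetical, and you would need to copy /flash/update.txt off the BMC yourself to get its contents.

```python
import re


def first_integer(text):
    """Return the first run of digits in text as an int.

    Assumption: the serial number is the first integer in update.txt.
    """
    match = re.search(r"\d+", text)
    if match is None:
        raise ValueError("no serial number found")
    return int(match.group(0))


def mcpc_serial_is_larger(mcpc_update_txt, bmc_update_txt):
    """True if the MCPC copy of update.txt has the larger serial number."""
    return first_integer(mcpc_update_txt) > first_integer(bmc_update_txt)
```

Pass in the contents of the two files as strings. If the function returns False, the serial number on the MCPC side is not larger than the one on the BMC.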
With /etc/network/interfaces ... comment out the eth1:1. You don't have to actually disconnect the switch, but you can if you want. You do want eth1 to be on the same subnet as the BMC. eth0 is your outside ethernet connection. This interfaces file has a quirk ... it will not accept spaces at the end of lines. We discovered this painfully.
iface lo inet loopback
iface eth0 inet dhcp <== we use dhcp in our Data Center but users often have a static eth0.
up service bind9 restart
iface eth1 inet static
address 10.3.16.25 <== our BMC IP for this machine is 10.3.16.125
iface crb0 inet static
up route add -net 192.168.0.0 netmask 255.255.255.0 gw 192.168.1.1 dev crb0
down route del -net 192.168.0.0 netmask 255.255.255.0 dev crb0
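Since spaces at the end of a line are easy to miss in an editor, a small check like the one below can point them out. This is a sketch in Python, not an sccKit tool; the function name and the sample text are made up for illustration.

```python
def lines_with_trailing_whitespace(text):
    """Return the 1-based numbers of lines that end in a space or tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip(" \t"):
            bad.append(number)
    return bad


# Hypothetical sample: the second line ends with a stray space.
sample = "iface lo inet loopback\niface eth1 inet static \n"
print(lines_with_trailing_whitespace(sample))  # prints [2]
```

To check the real file, read it with open("/etc/network/interfaces").read() and pass the result to the function; an empty list means no line ends in a space or tab.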
Thanks, it's working now.
We have a sccKit_126.96.36.199.tar.bz2 on our SVN http://marcbug.scc-dc.com/svn/repository/tarballs/
If you had to disable eMAC on your 1.4.0, you should now be able to install 1.4.1 with patches 1 and 2 (called 188.8.131.52) and enable eMAC. This is a fix for bug 264 http://marcbug.scc-dc.com/bugzilla3/show_bug.cgi?id=264
We have tried to install 184.108.40.206. sccBmc, sccReset and sccBoot appear to work, and soon afterwards the performance meters show that the cores are booted. After the cores booted we ran tcpdump and found that no packets are leaving the eMAC ports (note: we put a hub between the eMAC port and the switch to monitor network traffic). We can see the management PC sending ARP requests to the eMAC port, but there is still no response from the eMAC port.
Also, we have tried every usable eMAC port.