3 Replies Latest reply on Apr 23, 2014 11:56 AM by Dan_O

    NMI S2600CP - find Device

    fs01

      Hi all,

       

      My VMware host halted because of an NMI. The following events appear in the event log:

       

      PCIe Fat Sensor: Surprise Link Down Error on bus: 0, device: 2, function: 2. - Asserted

      PCIe Cor Sensor: Receiver Error on bus: 0, device: 2, function: 2. - Asserted:

       

      How can I find out which device is causing the interrupts? I suspect the RAID controller (LSI 9361-8i), but when I boot the system, the RAID controller's BIOS prints "Bus 6 Device 0" on the screen.

       

      System: Intel Server System P4308CP4MHGC

      Mainboard: S2600CP4

       

       

      Thank you!

       

      Edit:

       

      ~ # lspci

      0000:00:00.0 Bridge: Intel Corporation Ivytown DMI2 [PCIe RP[0000:00:00.0]]

      0000:00:01.0 Bridge: Intel Corporation Ivytown PCI Express Root Port 1a [PCIe RP[0000:00:01.0]]

      0000:00:01.1 Bridge: Intel Corporation Ivytown PCI Express Root Port 1b [PCIe RP[0000:00:01.1]]

      0000:00:02.0 Bridge: Intel Corporation Ivytown PCI Express Root Port 2a [PCIe RP[0000:00:02.0]]

      0000:00:02.2 Bridge: Intel Corporation Ivytown PCI Express Root Port 2c [PCIe RP[0000:00:02.2]]

      0000:00:03.0 Bridge: Intel Corporation Ivytown PCI Express Root Port 3a [PCIe RP[0000:00:03.0]]

      0000:00:03.2 Bridge: Intel Corporation Ivytown PCI Express Root Port 3c [PCIe RP[0000:00:03.2]]

      ...

      0000:06:00.0 Mass storage controller: LSI MegaRAID SAS Invader Controller [vmhba0]

        • 1. Re: NMI S2600CP - find Device
          Dan_O

          Bus 0, device 2, function 2 normally routes to slot #4 (the fourth PCIe slot, counting from the edge of the board inward).

           

          Was any device installed there previously?

           

          If the reported bus number ever changes after POST, you can look up the lanes on page 47 (63 of 231) of the TPS, at Intel® Server Board S2600CP Family Technical Product Specification.
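To tie the event-log address back to the lspci listing in the first post, here is a minimal Python sketch. It assumes the event log prints bus, device and function as decimal numbers (the samples only show 0 and 2, so this cannot be confirmed from the thread), and the regex is based on the two quoted event lines only. The function names are made up for illustration. Note that the error is reported against the root port; a card plugged into that slot shows up on its own bus (bus 6 for the RAID controller here).

```python
import re

# Pattern derived from the two event-log lines quoted in the thread;
# other firmware versions may format the message differently (assumption).
EVENT_RE = re.compile(r"bus:\s*(\d+),\s*device:\s*(\d+),\s*function:\s*(\d+)")


def event_to_address(event_line, domain=0):
    """Convert 'bus: 0, device: 2, function: 2' into an lspci-style address."""
    match = EVENT_RE.search(event_line)
    if match is None:
        return None
    # Assumption: the event log uses decimal numbers; lspci prints hex.
    bus, device, function = (int(group) for group in match.groups())
    return f"{domain:04x}:{bus:02x}:{device:02x}.{function:x}"


def find_device(lspci_lines, address):
    """Return the lspci line that starts with the given address, if any."""
    for line in lspci_lines:
        stripped = line.strip()
        if stripped.startswith(address + " "):
            return stripped
    return None


# Sample data copied from the first post.
event = ("PCIe Fat Sensor: Surprise Link Down Error on "
         "bus: 0, device: 2, function: 2. - Asserted")
lspci_lines = [
    "0000:00:02.0 Bridge: Intel Corporation Ivytown PCI Express Root Port 2a [PCIe RP[0000:00:02.0]]",
    "0000:00:02.2 Bridge: Intel Corporation Ivytown PCI Express Root Port 2c [PCIe RP[0000:00:02.2]]",
    "0000:06:00.0 Mass storage controller: LSI MegaRAID SAS Invader Controller [vmhba0]",
]

address = event_to_address(event)
print(address)                            # 0000:00:02.2
print(find_device(lspci_lines, address))  # the "Root Port 2c" line
```

With the data from this thread, the match is the root port 2c entry, which is the port feeding slot #4 as described above.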

          • 2. Re: NMI S2600CP - find Device
            fs01

            Thank you for the link, Dan_O. The RAID controller was installed in slot 4.

             

            In the meantime I have replaced the RAID controller with a new one (this time I put it in slot 6). Then I updated the mainboard firmware. During the ME update the system halted again because of an NMI (same NMI message, but this time for slot 6, where the RAID controller is now installed).

             

            What I noticed: each interrupt occurred as the system fans ramped up (for example during the ME update, or when I changed the system acoustic setting in the BIOS to "Performance").

             

            What can I do?

             

            Thanks all!

            • 3. Re: NMI S2600CP - find Device
              Dan_O

              So, two things. One, it is normal for the fans to ramp up when there is an error. Two, if you pull the RAID card, can you update the firmware (including ME) with no errors or interrupts? If you can, do that first. After it's done, shut down, pull AC power for 20 seconds, then boot into the BIOS and press F9 to restore defaults. After that, shut down and pull AC power again, put the RAID card back in, boot up, and update the RAID card firmware by itself.