2 Replies Latest reply on Feb 26, 2016 4:42 AM by Alessandro.Carloni

    Intel modular server - Ram failure. How to identify the broken bank?

    Alessandro.Carloni

      Hi everyone,

      One of my two servers has its warning LED on.

      The web interface says:

      ID: 6417
      Type: IPMI
      Detailed Description: Sensor: Fault Indication; The Server 2 Fault LED has been illuminated. This amber LED is located on the front of the server and is used to warn of conditions that require attention.
      Cause: Chassis Sensor: Fault Indication
      Action: Check the Server 2 events tab for additional events that may show the cause of the fault. Using the graphical user interface, mouse over the server's status icon to reveal conditions that may contribute to the server fault. Take actions to clear these fault conditions and the Fault LED will be turned off.
      Extra Data: s:68:"Raw IPMI (hex): Gen:2000 Num:5b Type:c0 EDir:6f ED1:03 ED2:ff ED3:ff";
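
The "Extra Data" string above packs the event into hex fields. As a rough sketch only, the following Python splits it into numbers and applies the generic IPMI conventions: bit 7 of the event direction byte distinguishes assertion from deassertion, 0x6f is the sensor-specific event type, and sensor types 0xc0 to 0xff are OEM-defined. The function names are hypothetical. Because the sensor type here falls in the OEM range, the decoded values alone do not name a DIMM slot.

```python
import re


def parse_raw_ipmi(extra_data: str) -> dict:
    """Split the 'Raw IPMI (hex)' string into a dict of integer fields."""
    # Keep only the part after the label, then pick up Name:hex pairs.
    payload = extra_data.split("Raw IPMI (hex):", 1)[-1]
    pairs = re.findall(r"(\w+):([0-9a-fA-F]+)\b", payload)
    return {name: int(value, 16) for name, value in pairs}


def describe(fields: dict) -> dict:
    """Apply generic IPMI conventions; OEM-specific meaning is NOT decoded."""
    edir = fields["EDir"]
    return {
        "sensor_number": fields["Num"],
        # Sensor types 0xC0-0xFF are reserved for OEM use.
        "sensor_type_is_oem": 0xC0 <= fields["Type"] <= 0xFF,
        # Bit 7 clear means the event was asserted.
        "assertion": (edir & 0x80) == 0,
        # Low 7 bits are the event/reading type; 0x6F is sensor-specific.
        "sensor_specific": (edir & 0x7F) == 0x6F,
        # Low nibble of event data 1 is the event offset.
        "event_offset": fields["ED1"] & 0x0F,
        "event_data": [fields["ED1"], fields["ED2"], fields["ED3"]],
    }


raw = 's:68:"Raw IPMI (hex): Gen:2000 Num:5b Type:c0 EDir:6f ED1:03 ED2:ff ED3:ff";'
fields = parse_raw_ipmi(raw)
summary = describe(fields)
print(summary)
```

What offset 3 means for an OEM sensor type is vendor-defined, so the event log on the Server 2 events tab is still the place to look for the actual cause.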

      The log on the Linux system running on that server says that a RAM bank is broken.

      How can I identify the faulty one without rebooting the server into a RAM test tool?

       

      Thanks

        • 1. Re: Intel modular server - Ram failure. How to identify the broken bank?
          Dan_Intel

          Hello,

           

          Please note that the Modular Server has reached End of Interactive Support (EOIS), so the only support we can offer for it is through the website. As an additional recommendation, you can log in to the CMM, which should list the bank there.

          • 2. Re: Intel modular server - Ram failure. How to identify the broken bank?
            Alessandro.Carloni

            Yeah, I already know that the IMS is EOL and EOS, and I'm furious with Intel about this. I can't explain to my boss that a server bought 3 years ago for about 20k euros is already end of life.

             

            BTW, going back to your answer, this is all I can see in the CMM:

            Slot     Present  Type  Size    Data Width  Speed    Manuf. ID  Part#               Serial#
            DIMM_A1  Yes      DDR3  8192MB  64          1333MHz  0x802C     36JSZF1G72PZ-1G4D1  0xE07DA276
            DIMM_A2  No
            DIMM_B1  Yes      DDR3  8192MB  64          1333MHz  0x802C     36JSZF1G72PZ-1G4D1  0xDA6CF1C1
            DIMM_B2  No
            DIMM_C1  Yes      DDR3  8192MB  64          1333MHz  0x802C     36JSZF1G72PZ-1G4D1  0xE07BF99B
            DIMM_C2  No
            DIMM_D1  Yes      DDR3  8192MB  64          1333MHz  0x802C     36JSZF1G72PZ-1G4D1  0xE07DA272
            DIMM_D2  No
            DIMM_E1  Yes      DDR3  8192MB  64          1333MHz  0x802C     36JSZF1G72PZ-1G4D1  0xE07DA21D
            DIMM_E2  No
            DIMM_F1  Yes      DDR3  8192MB  64          1333MHz  0x802C     36KSF1G72PZ-1G4M1   0xE76944D1
            DIMM_F2  No

            So I can't figure out where to find the information about the broken RAM.
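
One possible way to look for the failing module from the running Linux system, without a reboot, is the kernel's EDAC error counters in sysfs. The sketch below assumes an EDAC driver for the memory controller is loaded and that the kernel exposes per-DIMM entries under /sys/devices/system/edac/mc (older kernels expose csrow entries instead). Neither is confirmed for this server. The function names are hypothetical, and the labels EDAC reports may not match the CMM slot names such as DIMM_A1 unless they have been configured.

```python
from pathlib import Path

EDAC_ROOT = "/sys/devices/system/edac/mc"  # standard EDAC sysfs location


def _read(path: Path, default: str = "") -> str:
    """Return stripped file contents, or a default if the file is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return default


def _count(path: Path) -> int:
    """Read an error counter file; treat missing or odd content as zero."""
    text = _read(path, "0")
    return int(text) if text.isdigit() else 0


def edac_error_counts(root: str = EDAC_ROOT) -> list:
    """List (controller, entry, label, corrected, uncorrected) tuples."""
    rows = []
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        dimms = sorted(mc.glob("dimm[0-9]*"))
        if dimms:
            # Newer layout: one directory per DIMM with its own counters.
            for d in dimms:
                rows.append((mc.name, d.name, _read(d / "dimm_label"),
                             _count(d / "dimm_ce_count"),
                             _count(d / "dimm_ue_count")))
        else:
            # Older layout: counters are kept per chip-select row.
            for c in sorted(mc.glob("csrow[0-9]*")):
                rows.append((mc.name, c.name, "",
                             _count(c / "ce_count"), _count(c / "ue_count")))
    return rows


def suspects(rows: list) -> list:
    """Keep only entries with at least one corrected or uncorrected error."""
    return [r for r in rows if r[3] or r[4]]


if __name__ == "__main__":
    for row in suspects(edac_error_counts()):
        print(row)
```

If nothing is printed, either no errors have been counted since boot or the EDAC directories are not present on this kernel. In that case the counters cannot help and the Linux log message itself remains the only hint.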