13 Replies Latest reply on Mar 30, 2012 6:18 AM by snikers

    Two S5520HC issues

    BjoernG

      Hello everyone,

       

      I'm installing an Intel S5520HC board with two Intel Xeon E5520 CPUs and KVR1333D3E9SK3/6GI memory (Kingston 3x 2GB ECC kit) in a Chenbro SR107 chassis with three 120mm PWM fans connected to the board. The storage controller is an Adaptec 5805 connected to eight 1TB Seagate drives. Each CPU is cooled by an Intel STS100A.

       

      Now I have two issues:

      1) I can't get the system to boot with more than 8 memory modules (=16 GB). I can't find any limitation in the documentation that rules out 12x 2GB ECC (non-registered) memory (=24 GB). The diagnostic LEDs tell me the SPD is "too bad to run". I mixed the modules and tried other combinations, but 8 modules is the maximum I can get to work.

       

      2) I updated the firmware to the most current version (December 29, 2010: System BIOS - 55; BMC Firmware - 00.55; ME Firmware - 1.12; FRUSDR - 30). Since the release notes wanted me to update to a firmware with BMC 0.40 first, I stepped to the version from July 1st, 2009 first (System BIOS - 38; BMC Firmware - 00.40; ME Firmware - 1.11; FRUSDR - 19).

      Now the CPU fans seem to be running at full speed all the time. The chassis fans seem to be "ok" (judging by feel). This has been happening since the firmware update.

       

      Thank you very much in advance!

       

      Greetings

       

      Björn

        • 1. Re: Two S5520HC issues
          Doc_SilverCreek

          I guess I will start with the fans, but this applies to both issues.

          Check the SEL (System Event Log) using the SELVIEW tool: http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=17933&lang=eng

          Any event that would cause the fans to run fast should have logged an error.

          Memory errors should also be getting logged. (12 x 2 GB DIMMs should be fine.)

          Look for anything under Critical.
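As a rough illustration of picking out the critical entries, here is a minimal Python sketch. It assumes the SEL has been saved as plain text with one event per line and that critical events contain the word "CRITICAL", as in the IOH Thermal Trip line quoted later in this thread. The function name and the second sample line are made up for the example; the real SELVIEW export format may differ.

```python
def critical_events(sel_lines):
    """Return the SEL lines that are marked as critical events.

    Assumption: one event per line, with the word CRITICAL in the text.
    """
    return [line.strip() for line in sel_lines if "CRITICAL" in line.upper()]


# First line is copied from the SEL discussed later in this thread; the
# second line is a made-up informational entry for contrast.
sample = [
    "Temperature /IOH Thermal Trip (#0x6A)CRITICAL event: IOH Thermal Trip reports it has been asserted.",
    "Hypothetical informational entry: system boot completed.",
]

for event in critical_events(sample):
    print(event)
```

To use it on a saved log, pass the lines of the file instead of `sample`, for example `critical_events(open(path))` with `path` pointing at your own export.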

           

          Next I would go into BIOS setup (F2) and restore the defaults (F9).

          Then arrow over to Server Management and set Resume on AC Loss to Reset.

          Press F10 to exit.

          After the system finishes POST on the reset, remove AC and add all your memory back in. Power up and confirm it still fails.

          Turn AC off again and disconnect any non-critical connections to the motherboard.

          Primarily the front panel, the power supply Aux Power connector and the HDD hot swap bay. Then re-apply AC (the system should start up by itself since we set the reset on AC loss in BIOS). (I am looking for a possible I2C bus conflict between the motherboard DIMMs and the chassis.)

           

          That's about as deep as I should go without the SEL logs, since I may already be heading down the wrong path.

          • 2. Re: Two S5520HC issues
            BjoernG

            Doc_SilverCreek, first of all thanks for your quick answer!

             

            Concerning the fans:

            SELVIEW seems to freeze while loading the IPMI drivers when started through the EFI shell.

            It works fine when using the Linux version through Knoppix.

            Unfortunately, except for chassis intrusion warnings, there aren't any (recent) critical events that might lead us/me to the fan problem.

            Considering the fans blowing at full speed and SELVIEW freezing in EFI, could it be broken firmware?

            I ran through the fan and chassis configuration, although it warned me about the missing front temp. sensor (only on Intel chassis?), the missing front panel etc. It told me temperature measurement and therefore fan control might not work correctly, but it did work before "my" update.

             

            After updating the firmware it told me to restart the system through the front switch, so I did a hard reset through the front button.

            It kept on blowing for a few seconds and then turned off. It didn't come back until I powered it on again (normal?).

             

            My current settings through System Acoustics and Performance Configuration are:

            Set Throttling Mode          [Auto]

            Altitude                           [301m - 900m]

            Set Fan Profile                 [Performance]

            Fan PWM Offset              0

             

            At least changing Set Fan Profile no longer has any effect (it worked before the update).

             

            This is how far I have gotten with the memory:

            1) Set BIOS to defaults (F9) and set Resume on AC loss to Reset.

            2) Powered on the system with 8 modules and booted Knoppix (CPU fans keep running at full speed, btw.).

            3) Removed AC.

            4) Removed all non-critical components: SATA drives (1x optical, 1x HDD), storage controller, front USB connector, and the LEDs and switches (socket board) for power, reset, HDD, and GND.

            5) Put AC back; the system starts beeping and the LEDs on the board tell me memory modules C1 and C2 failed. It is running with 20 GB.

            6) Swapped them with other modules to see if it's a channel or module problem. Unfortunately C1 and C2 keep failing :-( Broken board or CPU?

            That would match my attempt to run it with six modules (A1, B1, C1 and D1, E1, F1). It didn't tell me about a failing C1 there, though.
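The swap test in the last step can be written down as a small Python sketch: if the same slots fail regardless of which modules sit in them, the fault follows the slot (board, socket or CPU side) rather than the DIMMs. The function name and the module labels m1 to m4 are hypothetical and only serve the illustration; this is not a diagnostic tool.

```python
def fault_follows(trials):
    """Classify failures as tied to slots, modules, or undetermined.

    Each trial is (placement, failed_slots), where placement maps a slot
    name to the label of the module installed in it.
    """
    failed_slot_sets = set()
    failed_module_sets = set()
    for placement, failed_slots in trials:
        failed_slot_sets.add(frozenset(failed_slots))
        failed_module_sets.add(frozenset(placement[s] for s in failed_slots))

    same_slots = len(failed_slot_sets) == 1
    same_modules = len(failed_module_sets) == 1

    if same_slots and not same_modules:
        return "slot"      # same slots fail, whatever module is in them
    if same_modules and not same_slots:
        return "module"    # the failure moves with the module
    return "undetermined"  # not enough variation between the trials


# Module labels m1..m4 are hypothetical; the slots are the ones reported.
trials = [
    ({"C1": "m1", "C2": "m2"}, ["C1", "C2"]),  # original modules
    ({"C1": "m3", "C2": "m4"}, ["C1", "C2"]),  # after swapping in others
]
print(fault_follows(trials))
```

With the two trials above the result is "slot", which matches the suspicion that the board or CPU side of channel C is at fault rather than the modules.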

             

            Uploaded the SEL files: http://osprey.bjoern-gies.de/sel/

            I cleared the BMC log before beginning my tests today.

            Events before that are in file 00. Events from today's tests are in file 01.

             

            Thanks again.

             

            Have a nice weekend. Greetings.

             

            Björn

            • 3. Re: Two S5520HC issues
              BjoernG

              Hey Doc_SilverCreek,

               

              do you have any suggestions left?

              Unfortunately I haven't gotten any further with my two problems.

              Can I somehow "reflash" the firmware and skip the fan configuration to get a generic one? That seemed to work on the old firmware.

               

              Thanks again.

               

              Björn

              • 4. Re: Two S5520HC issues
                Doc_SilverCreek

                Hmm,

                 

                First guess. (can't confirm)

                 

                When you load the SDR's, it should ask you a few questions. 

                "Select the function you desire to perform:"
                     "Update only the SDR repository"
                     "Update only the FRU repository"
                     "Update both the FRU and the SDR repository"
                     "Modify the Product Asset Tag"
                     "Exit"

                 

                And you should select "Update only the SDR repository".

                It should then probe to try to figure out what type of chassis the board is installed in.

                      "Auto detecting chassis type.... This may take upto two minutes based on configuration."

                 

                The Chenbro SR107 chassis should not be detected as any known chassis (I hope), so the next screen should ask:

                "Select the Chassis:"

                       "Intel(R) Entry Server Chassis SC5650DP"
                       "Intel(R) Entry Server Chassis SC5650BRP"
                       "Intel(R) Entry Server Chassis SC5650WS"
                       "Intel(R) Entry Server Chassis SC5600BASE"
                       "Intel(R) Entry Server Chassis SC5600BRP"
                       "Intel(R) Entry Server Chassis SC5600LX"
                       "Other Chassis"

                 

                You should select "Other Chassis".

                The system should then report the type of processors,

                the type of motherboard,

                and any HSCs or HSBPs it finds.

                 

                Then the next menu

                                    "The options provided are intended for OEMs and system integrators to allow the"
                                    "thermal control of fans in a third-party chassis. OEMs and system integrators"
                                    "must perform their own thermal testing for any changes made using these"
                                    "options. Intel cannot provide support for any changes made to fan settings to"
                                    "support third-party chassis. Third-party chassis vendors may have recommended"
                                    "settings for these configuration options for specific chassis."
                                    "INTEL ASSUMES NO RESPONSIBILITY FOR UNDESIRED RESULTS WHEN USING ANY CUSTOM FAN CONTROL CONFIGURATION ON INTEL(R) SERVER PRODUCTS"
                                   

                "Select a fan speed control profile for your chassis"
                   " Slow ramp "
                   " Medium ramp "
                   " Fast ramp "
                   " Full Speed Fans "

                This choice is not clear cut.

                If you select:

                Full Speed Fans: the fans will never slow down.

                Slow ramp: you might get overheating or you might not (very chassis dependent, but the fans should run really quiet).

                Medium or Fast ramp are the safer choices.

                 

                Then come the fan questions, which need to be answered according to how you have your system configured.

                "Is a fan connected to the Processor1 FAN connector?"

                "Is a fan connected to the Processor2 FAN connector?"

                "Is a fan connected to the SYS FAN1 connector?"

                "Is a fan connected to the SYS FAN2 connector?"

                "Is a fan connected to the SYS FAN3 connector?"

                "Is a fan connected to the SYS FAN4 connector?"

                "Is a fan connected to the SYS FAN5 connector?"

                 

                And

                "Does the system have chassis intrusion?"   - NO!!!  No!! No!!!  Your SEL looks like this is set to yes and floating, which keeps logging an error.

                The system may ramp the fans to 100% to help maintain air flow since it thinks the case is open (even if it is not).
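To make the point about matching the answers to the actual wiring concrete, here is a small Python sketch that compares a set of answers with what is physically connected. Everything in it (function name, dictionary keys, the example wiring) is hypothetical and only illustrates the reasoning; it does not talk to the FRUSDR utility or the BMC.

```python
def check_answers(answers, hardware):
    """Return warnings where an answer does not match the hardware."""
    warnings = []
    if answers["chassis_intrusion"] and not hardware["intrusion_switch_wired"]:
        warnings.append(
            "Chassis intrusion answered yes, but no switch is wired: "
            "the input floats and keeps logging intrusion events."
        )
    for connector, answered_yes in answers["fans"].items():
        if answered_yes != (connector in hardware["connected_fans"]):
            warnings.append(f"Fan answer for {connector} does not match the wiring.")
    return warnings


# Hypothetical example: which connectors are really used depends on the build.
answers = {
    "chassis_intrusion": True,
    "fans": {"Processor1 FAN": True, "Processor2 FAN": True, "SYS FAN1": True},
}
hardware = {
    "intrusion_switch_wired": False,
    "connected_fans": {"Processor1 FAN", "Processor2 FAN", "SYS FAN1"},
}

for warning in check_answers(answers, hardware):
    print(warning)
```

In this made-up configuration the only warning is the chassis intrusion one, which is the situation the SEL pointed to.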

                 

                "Does the front panel support a NMI button?"  usually No.

                 

                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                 

                This SEL event is a bit concerning, but I assume you may have been experimenting with the fans and things got hot.

                Temperature /IOH Thermal Trip (#0x6A)CRITICAL event: IOH Thermal Trip reports it has been asserted.

                 

                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                The memory does not need to be matched since the processors are independent, so 20 GB is valid.

                Now for the last 4 GB:

                It looks like C1 is working OK alone, but when C2 is added both DIMMs on the C channel get disabled?

                The most likely suspect is a very slightly bent pin in the CPU socket by these DIMMs.  See http://communities.intel.com/message/110448#110448

                • 5. Re: Two S5520HC issues
                  BjoernG

                  Man, thank you very much for your detailed answer!

                   

                  The chassis intrusion setting caused the fan issue! I set it to "no" and the fans run smoothly. With a little adjustment to the PWM offset (+40), since the Adaptec is getting quite warm, the whole rig is running at a good noise/temp ratio :-)

                   

                  Memory channel C keeps failing though.

                  C1 alone -> fail

                  C1 + C2 -> both fail

                  C2 alone -> C2 disabled

                  But I'll look under CPU1 for a bent pin!

                  I'll report my results.

                   

                  Have a nice weekend!

                   

                  Greetings from Germany

                   

                  Björn

                  • 6. Re: Two S5520HC issues
                    BjoernG

                    Hey,

                     

                    unfortunately I didn't get channel C working,

                    no matter what combination.

                    I looked under CPU#1 (no broken/bent pins) and then put it back in.

                    So I have to assume that either the board or the CPU has a problem.

                     

                    Thanks again for your time.

                     

                    Greetings

                     

                    Björn

                    • 7. Re: Two S5520HC issues

                      I am in the midst of putting together a system using the same board with X5690 CPUs and 12 sticks of Kingston memory for 96 GB.

                      Your symptoms could be from unequal tension on the heat sink screws and/or oxides on the contacts.  Sounds silly, but here's the scoop:  These CPUs (as well as the i7 Socket 1366 ones) have a known issue with poor conductivity from the CPU to some of the socket pins, which has been resulting in non-recognition of memory on some MBs.  As an electrical engineer and an owner of a computer shop for thirty years this year, it makes some sense to me.  It is one possibility.

                       

                      Solution that worked for me:

                       

                      Remove the HS and CPU related to the bank of memory that has issues.  Do not touch or try to remove oxides manually as this will cause additional issues.  All that is needed is to reseat the CPU and gently slide it back and forth about 2 times to allow the offending pins to scratch into the CPU contacts. Clamp the CPU down and then screw down the heat sink.  Now this is MOST important: follow the instructions from Intel, which are to tighten 2 turns on one corner, then 2 turns on the opposite corner (do this with a friend watching and pushing down on the heat sink, as often this second screw will not catch until after the third or fourth turn, causing the CPU to be lifted at one corner and losing contact with some pins).

                       

                      Then do 2 turns on each of the 2 remaining screws.

                       

                      Then repeat with 2 additional turns on the first screw then 2 additional turns on the opposite screw and 2 turns on the 2 remaining screws.

                       

                      You're done.

                       

                      Done well, this will eliminate one common issue.

                       

                      My issue with this board is that it makes it difficult to not use an Intel chassis and power supply.  Mixing with non-Intel parts can result in overheating issues caused by not knowing how to set up the fans manually when taking it to the max with the 130 W CPUs.

                      • 8. Re: Two S5520HC issues
                        Doc_SilverCreek

                        DON'T DO IT!!!!

                         

                        The CPU socket pins are EXTREMELY fragile!

                        If you slide the CPU, you will bend pins and total your motherboard!

                        The #1 cause of board failures is the CPU being slid while installing.

                        The contacts are GOLD on both the CPU and the pins. You might get thermal grease on them, but they will not oxidize.

                        There are a host of tools being produced specifically to prevent CPUs being slid into the socket on these boards.

                        The CPU is 100% tensioned by the CPU latching plate.

                        Tightening the heat sink as described is a good practice since it best prevents cross threading or binding the heat sink screws. (Also, never use a power screwdriver. They spin too fast and can cause galling of the stainless steel screws.)
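For reference, the cross-pattern tightening order described in reply 7 (two turns on one corner, two on the opposite corner, two on each of the remaining screws, then the same again) can be spelled out with a few lines of Python. The screw numbering and the function name are assumptions made for the sketch; always follow the heat sink's own instructions for the total number of turns.

```python
def tightening_sequence(turns_per_step=2, passes=2):
    """Yield (screw, turns) in a cross pattern.

    Screws are numbered 0-3 going around the heat sink, so screw 2 is
    diagonally opposite screw 0 and screw 3 is opposite screw 1.
    """
    order = [0, 2, 1, 3]  # one corner, its opposite, then the remaining two
    for _ in range(passes):
        for screw in order:
            yield screw, turns_per_step


for screw, turns in tightening_sequence():
    print(f"screw {screw}: {turns} turns")
```

The point of the pattern is that no corner gets far ahead of its opposite, so the heat sink comes down evenly.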

                         

                         

                        SDR fan settings

                        130 W processors put out a lot of heat.

                        1. Use an active heat sink with 130 W procs. This solves many issues.
                        2. Use a front panel temperature sensor. This gives you a much better response when setting the SDRs.
                        3. Be aware of the hot spots and make sure they have good fan coverage.
                        4. Anything with a heat sink needs good air flow across the heat sink.
                        5. BGAs (NICs, ICH10, PCIe slots)
                        6. HDDs
                        7. If you try to use an Intel chassis SDR, check your system for hot spots very closely. These SDRs are tuned to work with the Intel chassis; other chassis will be somewhat different, which means they will likely need to be set differently.
                        8. Using the OTHER option will allow you to select which fans you have connected, but make sure each fan is cooling the same general area that it was intended to cover. You can re-tune the SDR to cover non-standard zones, but that is a boatload of work.
                        9. In the Acoustics option tab (Advanced - last line) you can set fan PWM offsets with the newer BIOS code stack. This allows you to load a more standard SDR and then tweak all the fans up to get better cooling response where you need it.

                         

                         

                        I will have to see if I still have a simplified guide on how to create SDRs in my files.

                        They are not too bad once you get into them and really have a lot more functionality than most people realize.

                        • 9. Re: Two S5520HC issues

                          Hi All!

                           

                          Thanks for the solution, I had the same issue with the fans.

                           

                          I have a Thermaltake Element V chassis with a Thermaltake Grand PSU and of course a S5520HC board.

                           

                          My remaining problem is that when I power off the machine, or only plug it in, the motherboard tells

                          the PSU to put out some electricity, which is too much.

                          So when the machine is powered off, the fans vibrate and the optical drive is blinking. It's really annoying.

                           

                          Do you have any clue about setting the power states of the board? Or any other suggestions?

                           

                          Thanks for any help and sorry for my English

                           

                          bosko

                          • 10. Re: Two S5520HC issues
                            Doc_SilverCreek

                            When AC is plugged in, the 5 V standby power from the power supply should be live on the motherboard.

                            This powers the Baseboard Management Controller (BMC) and the NICs.

                            Both allow remote management and remote power up.

                             

                            Neither the optical drive nor any system or CPU fans should have power, except maybe a power supply fan.

                            If other devices are being powered, it sounds like a power supply or wiring issue.

                            NIC LEDs might flash since any broadcast network traffic will be responded to by the BMC NIC connection (such as a DHCP server or router sending an ARP).

                            • 11. Re: Two S5520HC issues

                              Thanks for the response, but unfortunately the optical drive is blinking

                              and there aren't any wires to mess with (SATA, and the power cable?)

                               

                              Do you know any way to manage the power states, or maybe disable the remote

                              management to ensure that when I shut down the machine it really shuts down?

                               

                              I will work on the problem too on the weekend, but thanks for any help.

                               

                              What really annoys me is that there is not enough juice to power up, for example,

                              the CPU fans, but they keep on shaking.

                               

                              Of course the NIC should work, but CPU fans are a little bit too much for a powered down machine.

                              • 12. Re: Two S5520HC issues
                                snikers

                                Hello

                                Have you managed to get the C1 and C2 slots working? I have exactly the same problem. I have tried changing the motherboard, CPUs and DIMMs with no success; it is always the same thing.

                                Somebody please help :)

                                • 13. Re: Two S5520HC issues
                                  snikers

                                  My problem with the C1 and C2 slots was resolved by changing the chassis.