I'll start with the fans, but this applies to both problems.
Check the SEL (System Event Log) using the SELVIEW tool: http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=17933&lang=eng
Any event that would cause the fans to run fast should have logged an error.
Memory errors should also be getting logged. (12 x 2 GB DIMMs should be fine.)
Look for anything under Critical.
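If the SEL has been saved to a text file, the critical entries can be picked out with a few lines of Python. This is only a sketch: it assumes each event sits on one line and carries a severity word such as CRITICAL, and the function name and the first sample line are made up for illustration. The real SELVIEW export format may differ.

```python
import re

# Assumption: a critical event line contains the word "critical".
# The lookbehind keeps "non-critical" lines from matching.
CRITICAL = re.compile(r"(?<![\w-])critical\b", re.IGNORECASE)

def critical_events(lines):
    """Return the stripped lines that carry a critical severity word."""
    return [line.strip() for line in lines if CRITICAL.search(line)]

# Hypothetical sample lines, not a real log
sample = [
    "Fan /System Fan 1 Non-critical event (hypothetical line)",
    "Temperature /IOH Thermal Trip (#0x6A) CRITICAL event: asserted",
]

for event in critical_events(sample):
    print(event)
```

With a real export, read the file with `open(path).readlines()` and pass the result to `critical_events`.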
Next I would go into BIOS setup (F2) and restore the defaults (F9).
Then arrow over to Server Management and set the Resume on AC loss to Reset.
F10 to exit.
After the system finishes POST following the reset, remove AC and add all your memory back in. Power up and confirm it still fails.
Remove AC again and disconnect any non-critical connections to the motherboard.
Primarily the front panel, the power supply aux power connector, and the HDD hot-swap bay. Then re-apply AC (the system should start up by itself since we set Reset on AC loss in BIOS). I am looking for a possible I2C bus conflict between the motherboard DIMMs and the chassis.
That's about as deep as I should go without the SEL logs, since I may already be heading down the wrong path.
Doc_SilverCreek, first of all thanks for your quick answer!
Concerning the fans:
SELVIEW seems to freeze at loading the IPMI drivers when starting through the EFI shell.
Works fine when using the linux version through Knoppix.
Unfortunately except chassis intrusion warnings there aren't any (recent) critical events that might lead us/me to the fan problem.
Considering the fans blowing at full speed and SELVIEW freezing in EFI, could it be broken firmware?
I ran through the fan and chassis configuration, although it warned me because of the missing front temp. sensor (only on Intel chassis?) and missing front panel etc. It told me temp. measuring and therefore fan control might not work correctly - it did before "my" update.
After updating the firmware it told me to restart the system through the front switch - So I did reset it hard through the front button.
It kept on blowing for a few seconds and turned off then. Didn't come back until I powered it on again - (normal?).
My current settings through System Acoustics and Performance Configuration are:
Set Throttling Mode [Auto]
Altitude [301m - 900m]
Set Fan Profile [Performance]
Fan PWM Offset 0
At least changing Set Fan Profile no longer has any effect (it worked before the update).
This is how far I have gotten with the memory so far:
1) Set BIOS to defaults (F9) and set Resume on AC loss to Reset.
2) Powered on the system with 8 modules and booted Knoppix (CPU fans keep running at full speed, by the way).
3) Removed AC.
4) Removed all non-critical components: SATA drives (1x optical, 1x HDD), storage controller, front USB connector, LEDs and switches (socket board) for power, reset, HDD, and GND.
5) Put AC back and the system starts beeping, with the LEDs on the board telling me memory modules C1 and C2 failed. It is running with 20 GB.
6) Swapped them with other modules to see if it's a channel or module problem. Unfortunately C1 and C2 keep failing :-( - Broken board or CPU?
That would match my attempt to run it with six modules (A1, B1, C1 and D1, E1, F1). It didn't tell me about a failing C1 there, though.
Uploaded the SEL files: http://osprey.bjoern-gies.de/sel/
I cleared the BMC log before beginning my tests today.
Events before that are in file 00. Events from today's tests are in file 01.
Have a nice weekend. Greetings.
Do you have any suggestions left?
Unfortunately I didn't come any further with my two problems.
Can I somehow "reflash" the firmware and skip the fan configuration to get a generic one - that seemed to have worked on the old firmware?
First guess. (can't confirm)
When you load the SDRs, it should ask you a few questions.
"Select the function you desire to perform:"
"Update only the SDR repository"
"Update only the FRU repository"
"Update both the FRU and the SDR repository"
"Modify the Product Asset Tag"
And you should select "Update only the SDR repository"
It should then probe to try to figure out what type of chassis the board is installed in.
"Auto detecting chassis type.... This may take upto two minutes based on configuration."
The Chenbro SR107 chassis should not be detected as any known chassis (I hope), so the next screen should ask:
"Select the Chassis:"
"Intel(R) Entry Server Chassis SC5650DP"
"Intel(R) Entry Server Chassis SC5650BRP"
"Intel(R) Entry Server Chassis SC5650WS"
"Intel(R) Entry Server Chassis SC5600BASE"
"Intel(R) Entry Server Chassis SC5600BRP"
"Intel(R) Entry Server Chassis SC5600LX"
You should select "Other Chassis"
The system should then report the type of processors,
the type of motherboard,
and any HSCs or HSBPs it finds.
Then the next menu
"The options provided are intended for OEMs and system integrators to allow the"
"thermal control of fans in a third-party chassis. OEMs and system integrators"
"must perform their own thermal testing for any changes made using these"
"options. Intel cannot provide support for any changes made to fan settings to"
"support third-party chassis. Third-party chassis vendors may have recommended"
"settings for these configuration options for specific chassis."
"INTEL ASSUMES NO RESPONSIBILITY FOR UNDESIRED RESULTS WHEN USING ANY CUSTOM FAN CONTROL CONFIGURATION ON INTEL(R) SERVER PRODUCTS"
"Select a fan speed control profile for your chassis"
" Slow ramp "
" Medium ramp "
" Fast ramp "
" Full Speed Fans "
This choice is not clear-cut.
If you select:
Full Speed Fans: the fans will never slow down.
Slow ramp: you might get overheating or you might not (very chassis dependent, but the fans should run really quiet).
Medium or Fast ramp are the safer choices.
Then come the fan questions, which need to be answered to match how you have your system configured.
"Is a fan connected to the Processor1 FAN connector?"
"Is a fan connected to the Processor2 FAN connector?"
"Is a fan connected to the SYS FAN1 connector?"
"Is a fan connected to the SYS FAN2 connector?"
"Is a fan connected to the SYS FAN3 connector?"
"Is a fan connected to the SYS FAN4 connector?"
"Is a fan connected to the SYS FAN5 connector?"
"Does the system have chassis intrusion?" - NO!!! No!! No!!! From your SEL it looks like this is set to yes and the input is floating, which keeps logging an error.
The system may ramp the fans to 100% to help maintain air flow because it thinks the case is open (even if it is not).
"Does the front panel support a NMI button?" usually No.
This SEL event is a bit concerning, but I assume you may have been experimenting with the fans and things got hot.
Temperature /IOH Thermal Trip (#0x6A) CRITICAL event: IOH Thermal Trip reports it has been asserted.
The memory does not need to be matched since the processors are independent, so 20 GB is valid.
Now, for the last 4 GB:
It looks like C1 is working OK alone, but when C2 is added both DIMMs on the C channel get disabled?
The most likely suspect is a very slightly bent pin in the CPU socket near these DIMMs. See http://communities.intel.com/message/110448#110448
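For reference, the 20 GB figure can be checked with simple slot arithmetic. This is a sketch in Python that assumes the layout discussed in this thread: six channels A to F with two slots each, and one 2 GB DIMM per slot. The names are made up for illustration.

```python
DIMM_GB = 2              # 12 x 2 GB DIMMs, as described in the thread
CHANNELS = "ABCDEF"      # assumption: channels A-F
SLOTS_PER_CHANNEL = 2    # assumption: slots 1 and 2 on each channel

def usable_gb(disabled):
    """Sum the capacity of every populated slot that is not disabled."""
    slots = [f"{ch}{n}" for ch in CHANNELS
             for n in range(1, SLOTS_PER_CHANNEL + 1)]
    return sum(DIMM_GB for slot in slots if slot not in disabled)

print(usable_gb(set()))         # all 12 DIMMs working: 24
print(usable_gb({"C1", "C2"}))  # channel C disabled: 20
```

So a system reporting 20 GB is consistent with exactly the two channel C DIMMs being disabled.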
Man, thank you very much for your detailed answer!
The chassis intrusion caused the fan issue! Set it to "no" and the fans run smooth. A little adjustment to the PWM offset (+40) since the Adaptec is getting quite warm and the whole rig is running in a good noise/temp ratio :-)
Memory channel C keeps failing though.
C1 alone -> fail
C1 + C2 -> both fail
C2 alone -> C2 disabled
But I'll look under CPU1 for a bent pin!
I'll report my results.
Have a nice weekend!
Greetings from Germany
Unfortunately I didn't get channel C working,
no matter what combination.
I looked under CPU#1 (no broken or bent pins) and then reinstalled it.
So I have to assume that either the board or the CPU has a problem.
Thanks again for your time.
I am in the midst of putting together a system using the same board with X5690 CPUs and 12 sticks of Kingston memory for 96 GB.
Your symptoms could be from unequal tension on the heatsink screws or oxides on the contacts. Sounds silly, but here's the scoop: these CPUs (as well as the i7 Socket 1366 parts) have a known issue with poor conductivity from the CPU to some of the socket pins, which has been resulting in non-recognition of memory on some motherboards. As an electrical engineer and the owner of a computer shop for thirty years this year, it makes some sense to me. It is one possibility.
Solution that worked for me:
Remove the heatsink and CPU related to the bank of memory that has issues. Do not touch or try to remove oxides manually, as this will cause additional issues. All that is needed is to reseat the CPU and gently slide it back and forth about 2 times to allow the offending pins to scratch into the CPU contacts. Clamp the CPU down and then screw down the heatsink. Now this is MOST important: follow the instructions from Intel, which are to tighten 2 turns on one corner, then 2 turns on the opposite corner. (Do this with a friend watching and pushing down on the heatsink, as often this second screw will not catch until after the third or fourth turn, causing the CPU to be lifted at one corner and lose contact with some pins.)
Then do 2 turns on each of the 2 remaining screws.
Then repeat with 2 additional turns on the first screw then 2 additional turns on the opposite screw and 2 turns on the 2 remaining screws.
Done well, this will eliminate one common issue.
My issue with this board is that it makes it difficult not to use an Intel chassis and power supply. Mixing with non-Intel parts can result in overheating issues caused by not knowing how to set up the fans manually when taking it to the max with the 130 W CPUs.
DON'T DO IT!!!!
The CPU socket pins are EXTREMELY fragile!
If you slide the CPU, you will bend pins and total your motherboard!
The #1 cause of board failures is the CPU being slid while installing.
The contacts are GOLD on both the CPU and the pins. You might get thermal grease on them, but they will not oxidize.
There are a host of tools being produced specifically to prevent CPUs from being slid into the socket on these boards.
The CPU is 100% tensioned by the CPU latching plate.
Tightening the heatsink as described is a good practice since it best prevents cross-threading or binding the heatsink screws. (Also, never use a power screwdriver. They spin too fast and can cause galling of the stainless steel screws.)
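The alternating tightening order described above can be written out as a short Python sketch. The screw numbering (1 to 4 going around the heatsink, so 1 and 3 are diagonal opposites) and the function name are assumptions for illustration only; Intel's own instructions take precedence.

```python
def tightening_sequence(passes=2, turns=2):
    """Return (screw, turns) steps for a cross-pattern tightening order.

    Assumption: screws are numbered 1-4 around the heatsink, so screw 3
    sits diagonally opposite screw 1.
    """
    order = [1, 3, 2, 4]  # first corner, opposite corner, then the remaining two
    return [(screw, turns) for _ in range(passes) for screw in order]

for step, (screw, turns) in enumerate(tightening_sequence(), start=1):
    print(f"Step {step}: screw {screw}, {turns} turns")
```

This prints eight steps: one pass of 2 turns per screw in the order 1, 3, 2, 4, then the same pass again.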
SDR fan settings
130 W processors put out a lot of heat.
- Use an active heatsink with 130 W processors. This solves many issues.
- Use a front panel temperature sensor. This gives you a much better response when setting the SDRs.
- Be aware of the hot spots and make sure they have good fan coverage.
- Anything with a heatsink needs good air flow across the heatsink.
- BGA (NICs, ICH10, PCIe slots)
- If you try to use an Intel chassis SDR, check your system for hot spots very closely. These SDRs are tuned to work with the Intel chassis. Other chassis will be somewhat different, which means they will likely need to be set differently.
- Using the OTHER option will allow you to select which fans you have connected, but make sure each fan is cooling the same general area that it was intended to cover. You can re-tune the SDR to cover non-standard zones, but that is a boatload of work.
- In the Acoustics option tab (Advanced, last line) you can set fan PWM offsets with the newer BIOS code stack. This allows you to load a more standard SDR and then tweak all the fans up to get better cooling response where you need it.
I will have to see if I still have a simplified guide on how to create SDRs in my files.
They are not too bad once you get into them and really have a lot more functionality than most people realize.
Thanks for the solution, I had the same issue with the fans.
I have a Thermaltake Element V chassis with a Thermaltake Grand PSU and of course a S5520HC board.
My remaining problem is that when I power off the machine, or only plug it in, the motherboard tells
the PSU to put out some electricity, which is too much.
So when the machine is powered off, the fans vibrate and the optical drive blinks. It's really annoying.
Do you have any clue about setting the power states of the board? Or any other suggestions?
Thanks for any help, and sorry for my English.
When AC is plugged in, the 5 V standby power from the power supply should be live on the motherboard.
This powers the Baseboard Management Controller (BMC) and the NICs.
Both allow remote management and remote power-up.
Neither the optical drive nor any system or CPU fans should have power, except maybe a power supply fan.
If other devices are being powered, it sounds like a power supply or wiring issue.
NIC LEDs might flash, since the BMC NIC connection will respond to any broadcast network traffic (such as a DHCP server or router sending an ARP).
Thanks for the response, but unfortunately the optical drive is blinking
and there aren't any wires to mess with (SATA, and the power cable?)
Do you know any way to manage the power states, or maybe disable the remote
management, to ensure that when I shut down the machine it really shuts down?
I will work on the problem too on the weekend, but thanks for any help.
What really annoys me is that there is not enough juice to power up, for example,
the CPU fans, but they keep on shaking.
Of course the NIC should work, but CPU fans are a little bit too much for a powered down machine.
Have you managed to get the C1 and C2 slots working? I have exactly the same problem. I have tried changing the motherboard, CPUs, and DIMMs with no success; it is always the same thing.
Please help, somebody :)
My problem with the C1 and C2 slots was resolved by changing the chassis.