It could be a power, thermal, or vibration issue. Try starting with a minimal configuration, then add components one at a time to isolate the cause.
Anything reported in the SEL?
It's an "S" SKU, which doesn't have a BMC...
Thank you, Edward. Thank you, Creek.
Could it be power?
The three 24-bay servers were put together in the same 19" 42U rack, so we pulled power from the next rack to support two of the three. The left rack has one 24-bay server and three 2U Dell servers. The other two 24-bay servers use power cords from the right rack, which has nothing but those two 24-bay servers. After 20 hours, the server on the left rack's power source died suddenly. Six hours later, one of the 24-bay servers on the right rack's power source died suddenly too. So it should not be a power source problem.
However, adding components one at a time (HDDs, I guess) is a good idea, especially since we have prepared another two chassis (Supermicro SC846E16-R1200), which will be tested with the S1200BTS and 24 x HDS 2TB 7.2k 6Gb SATA drives.
Anything in the SEL?
I can say there is nothing in the SEL related to the sudden death. The screen is black and there is nothing we can find or read, but the LEDs (power, NIC, POST code diagnostic) all looked normal. The dead one has the same LED status as the live ones. We did see some entries after powering back on, but they said nothing about the hang, the death, or the black screen. I have read through all the logs, including the huge volume of Microsoft-related entries, and none of them said anything about it. Nothing for us to investigate.
However, the weird thing is that simply replacing the S1200BTS with an S3420GPLC solved the problem. It's that simple. Replacing it with another two new S1200BTS boards didn't help. According to the Windows system event logs, none of the systems were busy when they died. We suspect the BIOS version. I don't know; we will try the new BIOS version 029, dedicated to the S1200BTS (version 030 is for the S1200BTL only).
ahhh, missed that. Guess we are even.
Hi everyone, it has been a while and there seems to be no progress on this issue. In the end we replaced all those S1200BTS boards with newly purchased S3420GPLC boards, which we believe are relatively reliable and have better-proven compatibility. They work well without any strange behavior. So that business was finally, and fortunately, secured.
NOPE, even the S3420GPLC didn't solve the problem with the Supermicro SC846E16 chassis (with 6G SAS expander). Three months have passed, and this combination has not satisfied the customer, because the new system sometimes reboots. A reboot looks better than sudden death with a black screen. However, it's still a very big problem for a server (or servers) here. We have not yet resolved the issue. We heard rumours that,
"When you purchase a Supermicro chassis, just use their server board. Their own server board matches the special design of their power system."
I forgot to mention,
We built a new system with a newly purchased Supermicro 846E16 24-bay chassis (with 6G SAS expander inside) and a new S5500-HCV with an E5605 CPU. Yes, a new set with the latest firmware. We spent three days and nights testing it with enthusiasm. The new system reboots every night. We are now sure we have to test with a Supermicro server board for this g.d. chassis.
Whatever, we just placed an order for a Supermicro X9SCL server board. It will probably arrive in 2 weeks. We will see. No tears, all in the stomach.