I've heard about strange behaviors when PSUs are connected to some models of UPS... What's the model of the UPS you're using? Do you have repeated errors on both PSUs?
I seem to get this email every day. The system has two power supplies, and I am trying to find documentation on how to figure out what the different error codes mean. The most bothersome is the one about the power supplies shown below. I get one email for PS1 and one for PS2. Both are plugged into different UPSs.
Event that generated this alert:
RID:0160 TS:05/18/2014 05:48:01 SN:PS1 Status ST:Power Supply ED:Predictive Failure ET:Asserted EC:Non-Critical
RID:0160 RT:02 TS:53784991 GID:0020 ER:04 ST:08 S#:50 ET:6F ED:A2 06 30 EX:01 FF FF FF FF FF FF FF
Are they both failing?
Digging this out requires the IPMI specification, the platform specification, and any device-specific specifications (PMBus in this case), or the troubleshooting guide: Server Products - System Event Log troubleshooting guides
The Intel SELVIEW tool provides a pretty good translation, but the ED: field contains a bit more: the power supply is complaining that the input AC voltage is too low.
And as Edward noted, the "AC" output of some UPSs leaves a lot to be desired (a pulsating DC square wave does not provide the same power as a true AC sine wave).
RID = Line number in SEL
RT = Record type = System Event
TS = Time Stamp
GID = Generator ID = 20 BMC did the logging
ER = Event Message Revision = 04 = IPMI 2.0
ST = Sensor Type = 08 = Power Supply
S# = Sensor Number = 50 = Power Supply 1 from SDR file
ET = Event Type = 6F = 0110 1111 (bit 7 = 0 = Assert); bits 0 to 6 = 6F = sensor-specific event (in this case a PMBus event)
ED = Event Data = The good stuff.
A2 = valid OEM data in the next 2 bytes
06 = Input voltage to the power supply LOW
30 = PMBus "STATUS_INPUT" byte = input undervoltage warning and input undervoltage fault.
EX = Extended Data = 01 FF FF FF FF FF FF FF = Not used for this sensor
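For anyone who wants to script the breakdown above, here is a rough Python sketch that splits the raw line from the alert email and pulls apart the ET and ED bytes. The function names (parse_sel_line, decode_event) are made up for illustration, and treating the third ED byte as the PMBus STATUS_INPUT byte is this platform's OEM convention as described above, not something generic IPMI guarantees. The bit names come from the PMBus specification; UNIT_OFF_LOW_INPUT is my shorthand for bit 3.

```python
# Raw SEL line as quoted in the alert email in this thread.
RAW = ("RID:0160 RT:02 TS:53784991 GID:0020 ER:04 ST:08 S#:50 "
       "ET:6F ED:A2 06 30 EX:01 FF FF FF FF FF FF FF")

# PMBus STATUS_INPUT flags, listed from bit 7 down to bit 0.
STATUS_INPUT_BITS = [
    "VIN_OV_FAULT", "VIN_OV_WARNING", "VIN_UV_WARNING", "VIN_UV_FAULT",
    "UNIT_OFF_LOW_INPUT", "IIN_OC_FAULT", "IIN_OC_WARNING", "PIN_OP_WARNING",
]


def parse_sel_line(line):
    """Split 'KEY:val val ...' pairs into {key: [hex strings]} (hypothetical helper)."""
    fields = {}
    key = None
    for token in line.split():
        if ":" in token:
            # A token containing ':' starts a new field.
            key, _, first = token.partition(":")
            fields[key] = [first]
        elif key is not None:
            # Bare tokens are further bytes of the current field.
            fields[key].append(token)
    return fields


def decode_event(fields):
    """Decode ET and ED, assuming ED byte 3 is PMBus STATUS_INPUT (OEM convention)."""
    et = int(fields["ET"][0], 16)
    ed = [int(b, 16) for b in fields["ED"]]
    status_input = ed[2]
    flags = [name for i, name in enumerate(STATUS_INPUT_BITS)
             if status_input & (0x80 >> i)]
    return {
        "asserted": (et & 0x80) == 0,           # bit 7 clear = assertion
        "event_type": et & 0x7F,                # 0x6F = sensor-specific
        "offset": ed[0] & 0x0F,                 # sensor-specific event offset
        "oem_byte2": (ed[0] >> 6) == 0b10,      # OEM code present in ED byte 2
        "oem_byte3": ((ed[0] >> 4) & 0b11) == 0b10,  # OEM code in ED byte 3
        "status_input_flags": flags,
    }


if __name__ == "__main__":
    for name, value in decode_event(parse_sel_line(RAW)).items():
        print(name, "=", value)
```

For the line in the alert this reports an asserted sensor-specific event with both undervoltage flags set, which matches the manual decode above.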
Will have to dig out the other info. Both UPSs are APC 1500 rack mounts. Don't have model numbers here. Got to run, server down, but will get more info as time allows.
Not sure if this was resolved yet or not, but at my company we've experienced similar issues. It had to do with the waveform that the UPS outputs. The Intel Server Systems come with PSUs that require pure sine wave output from a UPS. Most of the cheaper UPSs do not output a pure sine wave and instead use a stepped approximation to one. Both Intel and APC Support confirmed this for us, and both are getting lots of reports about it. There is no workaround other than using the correct type of UPS or no UPS.
APC has two lines of UPSs: Back-UPS, which uses the stepped waveform, and Smart-UPS, which uses a pure sine wave. The problem is that the difference in price is pretty big.
Hope this helps.
Well, I finally got around to checking my UPS info, and they are both Smart-UPS 1500s, so I am not sure that is the issue. I have issues with a JBOD2000 array also. The problem with the JBOD happens when AC power is restored. I don't know what to make of the alerts, and I now have a customer with a pair of these in different locations. One gets the warning and one does not.