15 Replies Latest reply: Oct 6, 2011 9:36 AM by Shaun

Two years of trouble with S5000PSL

Fulco Community Member

 

Problems:

 

Occasional crashes (BSOD, hangs, or restarts): most of the time, after a crash or restart, one or more drive cages are missing. I need to pull the power cables, wait a few minutes, reconnect the cables, and boot; then the drives are back.

Most of the time there is no crash information in Windows.

 

Restart from the OS (a normal restart, NOT A CRASH): often one or more drive cages are missing afterwards. The workaround is the same: pull the power cables, wait a few minutes, reconnect the cables, and boot; then the drives are back.

 

RMM2: no remote console possible (Vista/MacOS/XP; I can log in to the RMM2)

RMM2: first page of SEL log not visible

 

Intel Management: software interferes with windows OS: unable to install and use Intel Management

 

OS:

Windows Small Business Server 2008 SP1 (64 bit)

Windows Small Business Server 2003 R2 SP2 (32 bit) (replaced by SBS 2008)

 

Hardware (Intel):

Chassis: SC5400LX

Motherboard: S5000PSL

RMM: RMM2

Drive cages: AXX4DRV3GEXP and AXX6DRV3GEXP

 

Firmware/Drivers:

all components have latest versions

 

 

Intel Support:

Dozens of emails: problem still exists

 

So far I tried:

- Replaced all hard disks (found several problems with Seagate Barracuda 7200.11 SATA II disk drives)

- Replaced the drive cages EXP4 and EXP6

- Replaced RAID controller (Intel SRCSAS18E -> Adaptec 5805) (both crash in the same way)

- Replaced OS (new clean installations): before SBS 2003 -> now SBS 2008 (both crash in the same way)

- Replaced RAM

 

 

Please help,

Fulco

  • 1. Re: Two years of trouble with S5000PSL
    Axel Community Member

    Have you captured the BSOD you receive? Is it always the same? ....

     

    Also, you mentioned that sometimes the hot-swap cage is missing.... Which cables do you have plugged from the backplane to the controller card and/or motherboard?

     

    Are those 3.0 Serial ATA hard drives?.... have you tried forcing the hard drives to work at 1.5? ....

     

    let me know..

  • 2. Re: Two years of trouble with S5000PSL
    Fulco Community Member

    Most of the time the system just hangs or crashes, without BSOD.

    When there was a BSOD, it was a different one each time.

    So I can't give any more details on why the system crashed.

     

    Often, even after a normal restart, one or more drive cages (with drives) are missing.

    I switched/changed cages (with backplane), and cables.

    No pattern emerged: it happened in all combinations (across 4 drive cages).

     

    No, I didn't force the HD to use SATA 1.5.

    I only found that the Barracudas had the most problems.

    So I replaced them with WD drives.

    This (?) resulted in fewer crashes.

  • 3. Re: Two years of trouble with S5000PSL
    Axel Community Member

    I have seen many issues using SATA 2.0 hard drives; the vibration of these hard drives causes issues like the one you are experiencing. So, as a suggestion, try forcing them to work at 1.5 Gb/s and check the behavior.

     

    Additionally, I noticed there is a new backplane firmware update available with LOTS of fixes in it. Download the firmware and perform the update.

     

    http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&Inst=Yes&ProductID=2414&DwnldID=17719&strOSs=All&OSFullName=All%20Operating%20Systems&lang=eng

    6 Bay expander

     

    http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&Inst=Yes&ProductID=2414&DwnldID=17720&strOSs=All&OSFullName=All%20Operating%20Systems&lang=eng

    4 Bay expander

  • 4. Re: Two years of trouble with S5000PSL
    Fulco Community Member

    I have started testing backplane firmware v 2.12.

    Thanks for link to firmware!

  • 5. Re: Two years of trouble with S5000PSL
    Fulco Community Member

    Problems:

     

    I tried to install an ISDN-card (PrimuX S0 NT).

    Even with extended support from the manufacturer, we didn't succeed in getting the product to work with the S5000PSL.

    We also tried the PrimuX USB, which didn't work either. There were USB resets disconnecting the device.

  • 6. Re: Two years of trouble with S5000PSL
    Fulco Community Member

    The update of the HBA firmware to version 2.12 seems to solve the "missing drives after restart" issue (after both genuine restarts and crashes).

    However, the BSODs are still there:

    STOP 0x...D1, 0x,...34C

    DRIVER_IRQL_NOT_LESS_OR_EQUAL

    tcpip.sys

    This happens if Adaptec Agent (Service) is enabled.

    On our system this error is reproducible.

    I did at least 3 installations!

    Adaptec, which is also using an S5000PSL board, is unable to reproduce this error.

    This time the crash damaged the boot sector (?): chkdsk C: reports "second NTFS boot sector unwritable" and Windows Backup fails every time.

    Fulco

  • 7. Re: Two years of trouble with S5000PSL
    pjm0 Community Member

    Did the HBA issues (missing drives) come back or does that seem to be fixed? I've been fighting the same problem (I think) for a while now and sometimes it would appear to be fixed for a few weeks and then come back.

     

    When you were having issues with the "missing drives" were you getting MR_MONITOR warnings (a lot of them) in the Event logs?

     

    I seem to be having the same issue with the missing drives, and I was getting a lot of MR_MONITOR errors in the Event logs. I just tried the HBA firmware update, but it has only been a few days without errors, so I'm not sure whether it has fixed my problem.

  • 8. Re: Two years of trouble with S5000PSL
    Fulco Community Member

    I replaced the Intel RAID Controller (see original post), because I thought the Controller was causing the crashes. So no MR_Monitor events.

    However the Adaptec RAID Controller suffered from the same problem.

     

    Now we know this was due to the Firmware issues: Intel TA: TA-933-1
    “The frequency of the failure under normal operating conditions is once in 1 to 12 months depending on the system configuration. The failure may occur with high probability during system FRUSDR update.”
    A ‘little’ lie: our system had 1-14 days between crashes! So more than 75 in a year!
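
    For anyone comparing the two figures, here is a minimal sketch of the arithmetic. It assumes a 365-day year and converts "once in N months" to a yearly rate as 12 / N; the function names are only for illustration. The average interval between crashes is not stated above, so the last line only shows which average interval a figure of 75 crashes per year would imply.

```python
DAYS_PER_YEAR = 365  # assumption: non-leap year


def per_year_from_months(months_between_failures):
    """Failures per year if one failure happens every N months."""
    return 12 / months_between_failures


def per_year_from_days(days_between_failures):
    """Failures per year if one failure happens every N days."""
    return DAYS_PER_YEAR / days_between_failures


# Range quoted from the technical advisory: once in 1 to 12 months.
advisory_range = (per_year_from_months(12), per_year_from_months(1))

# Range reported for this system: 1 to 14 days between crashes.
observed_range = (per_year_from_days(14), per_year_from_days(1))

# Average interval (in days) that would give exactly 75 crashes per year.
interval_for_75 = DAYS_PER_YEAR / 75

print("advisory, failures/year:", advisory_range)
print("observed, failures/year:", observed_range)
print("days between crashes for 75/year:", round(interval_for_75, 1))
```

    Under these assumptions the advisory corresponds to between 1 and 12 failures per year, while an interval of 1-14 days corresponds to roughly 26 to 365 per year, so even the best case observed here is more than double the advisory's worst case.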

    Sorry, but I am really annoyed about this. (I reported ‘disk’ problems at the beginning of 2008.)

     

    I can't confirm that all issues are solved.

    The missing drive cage seems to be solved.

    However, the system still has serious problems:

    Enabling the Adaptec Agent (comparable with MR_Monitor) crashes the system.

    Adaptec said, this could be related to a firmware issue.

     

    I am in contact with an Adaptec engineer trying to solve this issue.

     

    The last crash left the system unable to run backups.

    Trying other backup software (Acronis) led to even more problems: now the NICs have disappeared (and Adaptec can’t use VPN to access the system).

     

     

    Fulco

  • 9. Re: Two years of trouble with S5000PSL
    pjm0 Community Member

    That is nuts. Thanks for the info. If I hadn't found this post I'd think I was losing my mind.

     

    About 5 months ago I built 3 servers with identical parts. One server was a mess with these issues and the other two were fine. After replacing all of the parts and spending hours on the phone with Intel (they kept telling me they had never seen this before), I finally gave up on that one and just installed the two good ones. One of them has never had any issues, but the other just started to fail a week ago after working fine for 4+ months. When I called Intel last week they never mentioned TA-933-1. I updated it on Friday, but I guess I'll have to baby this server because I won't ever be sure it is right.

     

    Thanks again. I really hope you solve your issues. I seem to have more luck finding answers on sites like this than from actual tech support now.

  • 10. Re: Two years of trouble with S5000PSL
    Zirafarafa Community Member

    Me too 

     

    I have two of these servers, both running Linux and both giving same errors - crashing, drives missing after restart, rmm failing.

     

    I have left one turned off (!!) for the last few months. I will power it up again and try to get all updates done (including the one from TA-933-1) to see if I get any further with it.

  • 11. Re: Two years of trouble with S5000PSL
    Fulco Community Member

    Since Intel published firmware version 2.12, there haven’t been any crashes!

    The only issue in version 2.12 is that SATA drives are seen as 1.5 Gb/s (not 3.0 Gb/s) drives.

    Version 2.14 fixed this.

    Fulco

  • 12. Re: Two years of trouble with S5000PSL
    Zirafarafa Community Member

    Hmmm

     

    Even after applying all driver and firmware updates, I am still getting errors:

     

    Feb 26 17:19:17 storage-test-test MR_MONITOR[7588]: <MRMON181> Controller ID: 0  Enclosure shutdown: Ports 4-7:1

    Feb 26 17:22:22 storage-test-test MR_MONITOR[8213]: <MRMON113> Controller ID: 0  Unexpected sense: PD = :16 - Enclosure services unavailable, CDB =  0x1c  0x01  0x0e  0x14  0x00  0x00 , Sense =  0x70  0x00  0x02  0x00  0x00  0x00  0x00  0x0a  0x00  0x00  0x00  0x00  0x35  0x02  0x00  0x00  0x00  0x00

     

    Does anyone have any insight into this?
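
    In case it helps with reading the second log line, here is a minimal sketch that pulls the CDB and sense bytes out of it and decodes them as SCSI fixed-format sense data (sense key in the low nibble of byte 2, ASC and ASCQ in bytes 12 and 13). The lookup tables are deliberately tiny and only cover what is needed here, and the function names are made up for this example; it is not part of any MR_MONITOR tooling.

```python
import re

# Only the codes needed for the log line above; the full tables in the
# SCSI standards are much longer.
SENSE_KEYS = {0x2: "NOT READY", 0x4: "HARDWARE ERROR", 0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x35, 0x00): "ENCLOSURE SERVICES FAILURE",
    (0x35, 0x02): "ENCLOSURE SERVICES UNAVAILABLE",
}
OPCODES = {0x1C: "RECEIVE DIAGNOSTIC RESULTS"}


def parse_hex_bytes(text):
    """Turn a run like ' 0x70  0x00 ' into a list of integers."""
    return [int(tok, 16) for tok in re.findall(r"0x([0-9a-fA-F]{2})", text)]


def decode_fixed_sense(sense):
    """Decode the sense key and ASC/ASCQ from fixed-format sense data."""
    if len(sense) < 14 or (sense[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("expected at least 14 bytes of fixed-format sense data")
    key = sense[2] & 0x0F
    asc, ascq = sense[12], sense[13]
    return {
        "sense_key": SENSE_KEYS.get(key, f"unknown (0x{key:x})"),
        "asc_ascq": (asc, ascq),
        "meaning": ASC_ASCQ.get((asc, ascq), "not in this small table"),
    }


# The second log line from the post, copied as-is.
LOG_LINE = (
    "Feb 26 17:22:22 storage-test-test MR_MONITOR[8213]: <MRMON113> "
    "Controller ID: 0  Unexpected sense: PD = :16 - Enclosure services "
    "unavailable, CDB =  0x1c  0x01  0x0e  0x14  0x00  0x00 , Sense =  "
    "0x70  0x00  0x02  0x00  0x00  0x00  0x00  0x0a  0x00  0x00  0x00  "
    "0x00  0x35  0x02  0x00  0x00  0x00  0x00"
)

match = re.search(r"CDB =(.*?), Sense =(.*)$", LOG_LINE)
cdb = parse_hex_bytes(match.group(1))
decoded = decode_fixed_sense(parse_hex_bytes(match.group(2)))

print("command:", OPCODES.get(cdb[0], "unknown opcode"))
print("sense:", decoded)
```

    The decoded values agree with the text MR_MONITOR already prints: the controller sent a RECEIVE DIAGNOSTIC RESULTS command to the enclosure and got back NOT READY with ASC/ASCQ 35h/02h, "enclosure services unavailable". That points at the enclosure (backplane) side not answering rather than at a disk, but it does not by itself say why.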

  • 13. Re: Two years of trouble with S5000PSL
    Zirafarafa Community Member
  • 14. Re: Two years of trouble with S5000PSL
    Fulco Community Member

    http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=18308&lang=eng

     

    I don't know if you are experiencing the same problem: on our system there were never any crash dumps or other log errors.

    The system just froze and was unable to restart (see original post).
