0 Replies Latest reply on Sep 16, 2013 1:08 PM by jj5406

    RAID still reporting predictive failure after drive replacement


      I have a machine running CentOS 5.3 with a 6-disk RAID 5 array. According to RAID Web Console 2, the RAID card appears to be an SRCSASBB81.

      About a week ago, I started to receive these predictive failure warnings (once per day).



      Controller ID: 0  PD Predictive failure: --:--:4
      Generated on:Mon Sep 16 08:29:57 2013

      IP Address: REDACTED
      OS Name: Linux
      OS Version: 2.6
      Driver Name: megaraid_sas
      Driver Version:

      BIOS Version: 1.12.122-0393
      Firmware Package Version: 8.0.1-0029
      Firmware Version: NT16


      So I started the Intel RAID Web Console, looked at all the drives, and saw that drive 4 did have a "pred fail count" of 1. All the other disks had 0 in that field. I figured that's what the "--:--:4" in the warning was referring to. I backed up everything on the RAID, identified the physical location of every drive, and then used the RAID Web Console to take drive 4 offline (putting the array into a degraded state). The light on the physical drive in the expected location turned orange, as expected. I removed the disk and replaced it with a new one. The array rebuilt and came back to optimal with the new disk. All went as planned. Yay!
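
      For cross-checking the per-drive counters outside the web console, here is a minimal sketch. It assumes the MegaCli command-line utility is available for this controller and that the text it prints for a physical-drive listing contains "Enclosure Device ID", "Slot Number", "Device Id" and "Predictive Failure Count" lines for each drive; that layout is an assumption and may differ by version. The function names and the SAMPLE text are made up for illustration and are not output from this machine. The sketch reports both the slot number and the device ID, since it is not certain which of the two the "4" in the warning refers to.

```python
# Sketch only: parses saved text in the (assumed) MegaCli physical-drive
# listing format and reports drives with a non-zero predictive failure count.

# Hypothetical sample text, invented for illustration.
SAMPLE = """\
Enclosure Device ID: 252
Slot Number: 0
Device Id: 5
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0

Enclosure Device ID: 252
Slot Number: 1
Device Id: 4
Predictive Failure Count: 1
Last Predictive Failure Event Seq Number: 1234
"""

# Field names we care about, mapped to short dictionary keys.
WANTED = {
    "Slot Number": "slot",
    "Device Id": "device_id",
    "Predictive Failure Count": "pred_fail_count",
}


def parse_pd_list(text):
    """Return one dict per physical drive found in the listing text."""
    drives = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            # Assumed to be the first line of each drive's record.
            current = {}
            drives.append(current)
        elif key in WANTED and current is not None and value.isdigit():
            current[WANTED[key]] = int(value)
    return drives


def suspects(drives):
    """Return the drives whose predictive failure count is above zero."""
    return [d for d in drives if d.get("pred_fail_count", 0) > 0]


if __name__ == "__main__":
    for drive in suspects(parse_pd_list(SAMPLE)):
        print(drive)
```

      Matching on the exact field name keeps the "Last Predictive Failure Event Seq Number" line from being mistaken for the count.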


      However, every morning at 7:30 AM, I still get this same predictive failure warning.  The "pred fail count" on the new disk (like all the others) is now 0.  Everything looks fine.  Is there some file where I have to manually reset some failure count?  I can't see anything in the UI that indicates there is something else I need to do.


      Please help me understand what's going on and what further steps I should take.