12 Replies Latest reply: Oct 19, 2012 4:07 PM by Craig Duttweiler

RAID5 Inaccessible after BIOS Reset to Default

leggy8 Community Member

Long story short, Windows locked up, so I force-restarted and was prompted that the BIOS settings were unstable, so I reset them back to default.  From that point on, Windows no longer recognized my RAID5 (3x 1TB hard drives).

 

I made sure the BIOS setting was set to "RAID" (versus "IDE" or "AHCI").  When Windows booted up and I went into Disk Management, it prompted me to initialize the disk; I tried "MBR" but got an error message.  Should I try "GPT"?

 

Windows and the Intel Matrix Storage Console definitely detect the hard drives, and I believe there is nothing wrong with them (i.e. they are not dead or corrupted).  In the Console, the RAID's hard drives show as "Missing..."; instead, two of the hard drives are listed in another section of the Console as "Non-RAID Hard Drives".

 

 

Should I try (1) deleting the RAID volume and then (2) re-creating it in the Intel Matrix Storage Manager ROM (CTRL+I during bootup)?  Will I lose the data that way?  The boot screen definitely detects them, but as a "Non-RAID Disk".  For status it says "Failed" and "No" for bootable.  Obviously it's failed, since it looks like it doesn't detect 2 of the 3 hard drives as part of the array.

 

 

Motherboard: MSI P35 Neo2-FR LGA 775 Intel P35 ATX Intel Motherboard

Chipset: Intel ICH9R

OS: Windows 7 Professional

 

intel_matrix_storage_console.png

 

Note: this actually happened to me before.  I don't remember if it was because of a situation similar to this or when I upgraded from Windows XP to Windows 7 (probably the latter).  I can't recall what I did to fix it but I think it was changing the BIOS to "RAID" and initializing the disk in Disk Management.

  • 1. Re: RAID5 Inaccessible after BIOS Reset to Default
    Diego_Intel Community Member

    Hello,

     

    The problem here is that the metadata on the disks, which holds the configuration information for each hard drive's role in the RAID, is corrupted, so the disks are not detected as RAID members and the volume is reported as missing two drives.

     

    Creating a new RAID volume with those drives will write new metadata to those hard drives and yes, it will delete all information on them.

     

    As you may know, Intel(R) Rapid Storage Technology does not provide recovery options for failed RAID 10, RAID 5, or RAID 0 volumes; in this case the data is not accessible and cannot be recovered.

     

    At this point I would suggest you check for a third-party recovery tool that could retrieve the information from these hard drives, or check with your motherboard manufacturer to see if they provide any utility for that.

  • 2. Re: RAID5 Inaccessible after BIOS Reset to Default
    Zyprexa Community Member

    Please follow up if you find a tool that fixes this; I have the same problem (only with 6 drives, 4 found, 2 showing up as non-RAID).

  • 3. Re: RAID5 Inaccessible after BIOS Reset to Default
    leggy8 Community Member

    I haven't recovered the data yet but you can try Runtime's software.  I used their GetDataBack program before and it worked like a charm.

     

    For their RAID recovery program, you may need to set your BIOS to AHCI, since Windows will "hide" the RAID drives.  If you're using a separate RAID controller card, then you'd want to plug the drives into the motherboard's regular SATA ports instead.

  • 4. Re: RAID5 Inaccessible after BIOS Reset to Default
    Zyprexa Community Member

    Unfortunately, the array contained a fully encrypted (TrueCrypt) filesystem.  Because of this, simply recovering individual files isn't an option; I need to recover the array itself so it can be mounted, even if only read-only, to copy the files off.

     

    For my day job I work with enterprise-level RAID hardware that separates the array creation and initialization processes, such that if you re-create an array with exactly the same parameters, it will simply create the array structure whilst leaving the data intact.  One must then initialize the array to wipe out any old data.  What I'm looking for is the equivalent for Intel's RAID setup.

  • 5. Re: RAID5 Inaccessible after BIOS Reset to Default
    Zyprexa Community Member

    I'm thinking I can maybe re-write the RAID structure without wiping out the data using mdadm.

     

    http://download.intel.com/design/intarch/PAPERS/326024.pdf

     

    Later versions of mdadm support Intel ICHx (IMSM) arrays.  This is great news.  Now all I need to do is wait for hard drive prices to go back to normal so I can pick up 6x 1TB drives and mirror everything before I start attempting the recovery.  If I have any success I'll be sure to follow up.
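
    If you want to check ahead of time whether your mdadm build actually understands Intel's metadata, something like this from a Linux live CD should tell you without touching the disks.  Treat it as a sketch; I haven't verified the exact output on the ICH9R yet:

      # support for Intel's IMSM metadata arrived around mdadm 3.0, so check the version first
      mdadm --version
      # show what the Intel option ROM / platform reports (needs the BIOS in RAID mode)
      mdadm --detail-platform
      # read-only dump of whatever RAID metadata is on one member disk
      mdadm --examine /dev/sda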

  • 6. Re: RAID5 Inaccessible after BIOS Reset to Default
    scottdavis Community Member

    Please do follow up, as I've had this happen to me twice.

    Fortunately, I keep two RAID5s, and sync them, so when this happens I just recreate the array that failed and re-sync it with the good one.

    Anyone who thinks that their data is safe on ONE Intel RAID5 array is nuts.

    Yes, you're protected from one hard drive failure, but what you discover is that you're not protected from BIOS/metadata failure, and both times the array failed on me, power had suddenly been lost - once via an unexpected bluescreen/reboot and the other time through an actual power failure.

    I now keep both RAID5 machines on battery backup, but that does not protect me from things getting messed up by bluescreen/reboots.

    IMHO, this is a gaping hole in the stability of this platform, and if there were simply a tool to "set things right" when 2 or more disks suddenly leave the array (causing irretrievable failure) I could sleep a lot better at night.

  • 7. Re: RAID5 Inaccessible after BIOS Reset to Default
    Zyprexa Community Member

    I haven't forgotten about this, but the data is too important to mess with without first duplicating the array[1], and 6TB of disk is still too expensive to buy, with prices more than double what they were 3 months ago.  Until drive prices go back down to what they were in September 2011, the array and the server that ran it are going to sit in storage.

     

    [1] The irony that I won't mess with it until it's backed up, although I didn't have a backup prior to it going down, isn't lost on me ... ლ(ಠ_ಠლ)

  • 8. Re: RAID5 Inaccessible after BIOS Reset to Default
    Zyprexa Community Member

    I have disk.. let the recovery begin...

  • 9. Re: RAID5 Inaccessible after BIOS Reset to Default
    Zyprexa Community Member

    So.. I didn't forget about this.  I just pushed it aside until I could dedicate a lot of time to it.

     

    First, I *believe* I've been able to recover my array.  I say believe because it's currently rebuilding, and if you remember, it's a truecrypt'd disk.. but I have a bad feeling that I've now forgotten the password for it

     

    As I speculated before, I was able to use Linux+mdadm to achieve this.  I downloaded the Fedora 17 Live CD and booted into that.  This gives you Linux and the RAID tools.

     

    ** I STRONGLY RECOMMEND YOU MAKE A DUPLICATE OF ANY AND ALL DRIVES BEFORE ATTEMPTING ANYTHING I'M ABOUT TO WRITE - FOR ALL I KNOW I'VE DESTROYED MY DATA AND I HAVE NOT YET DISCOVERED THIS **
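
    If you have spare drives of at least the same size, something as simple as dd can clone each member before you touch anything.  This is a sketch only: the /dev/sdg target here is a made-up example, so triple-check which device is which before running it, because getting if= and of= backwards will destroy a source drive:

      # clone one RAID member onto a spare disk, carrying on past read errors
      dd if=/dev/sda of=/dev/sdg bs=4M conv=noerror,sync

    Repeat for each member drive, or image them to files on a big enough filesystem with something like of=/path/to/sda.img (a smaller bs loses less data around any bad sectors).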

     

    My array consists of 6 1TB drives in a RAID 5 config on an Intel ICH9 Firmware RAID setup.  In Linux the 6 drives showed up as /dev/sda, /dev/sdb.. to /dev/sdf.

     

    First we need to examine each drive to make sure the OS can actually read them.  Use the mdadm --examine command like this:

     

      mdadm --examine /dev/sda

     

    .. then b,c,d,e,f.  This is what my /dev/sda returned:

     

              Magic : Intel Raid ISM Cfg Sig.

            Version : 1.3.00

        Orig Family : 72b2e8ce

             Family : 72b81d8a

         Generation : 00129b86

         Attributes : All supported

               UUID : 8703e88d:b9cacc28:47437ab7:11aab7b3

           Checksum : 437acd3a correct

        MPB Sectors : 2

              Disks : 6

       RAID Devices : 1

     

      Disk04 Serial : 9VP0Z8Q8

              State : active

                 Id : 00000000

        Usable Size : 1953518862 (931.51 GiB 1000.20 GB)

     

    [Volume_01]:

               UUID : ee57e113:17d95751:6665358e:b646fbfa

         RAID Level : 5

            Members : 6

              Slots : [_UUU__]

        Failed disk : 5

          This Slot : 4 (out-of-sync)

         Array Size : 9767485440 (4657.50 GiB 5000.95 GB)

       Per Dev Size : 1953497352 (931.50 GiB 1000.19 GB)

      Sector Offset : 0

        Num Stripes : 15261696

         Chunk Size : 64 KiB

           Reserved : 0

      Migrate State : idle

          Map State : failed

        Dirty State : clean

     

      Disk00 Serial : D-WCASJ0412866:0

              State : active

                 Id : ffffffff

        Usable Size : 1953518686 (931.51 GiB 1000.20 GB)

     

      Disk01 Serial : WD-WCAU40247947

              State : active

                 Id : 00010000

        Usable Size : 1953518862 (931.51 GiB 1000.20 GB)

     

      Disk02 Serial : WD-WCAU40183694

              State : active

                 Id : 00030000

        Usable Size : 1953518862 (931.51 GiB 1000.20 GB)

     

      Disk03 Serial : WD-WCAU40338897

              State : active

                 Id : 00040000

        Usable Size : 1953518862 (931.51 GiB 1000.20 GB)

     

      Disk05 Serial : D-WCAU44808263:0

              State : active

                 Id : ffffffff

        Usable Size : 1953518686 (931.51 GiB 1000.20 GB)

     

    From this you can see all the information we need to build the array, even though it thinks 3 of the disks are bad.  /dev/sdb, sdd and sde showed similar info, but /dev/sdc and /dev/sdf did not; they showed this:

     

    /dev/sdc:

       MBR Magic : aa55

    Partition[0] :   4294967295 sectors at            1 (type ee)


    The RAID metadata (superblock) appears to be missing.  Luckily, we can fix this.


    After much digging I found this page - 'Re: Problem recovering failed Intel Rapid Storage raid5 volume' - MARC - which helped me a lot.  To break it down, this is what I did:


    First enable mounting of dirty arrays:


    echo 1 > /sys/module/md_mod/parameters/start_dirty_degraded


    Second, (re)-create the IMSM container that holds the array:


    mdadm -C /dev/md/imsm -e imsm -n 6 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf


    The "-n 6" is because I have 6 drives, /dev/sda - /dev/sdf.  Change that if you have more/less.


    Next, (re)-create the RAID 5 array in the IMSM container:


    mdadm -C /dev/md0 -l5 -n6 -c 64 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf


    "-l5" means RAID 5, "-n6" again means 6 devices and "-c 64" is the chunk size used.  You saw this when you ran "mdadm --examine" on one of the good drives.  It looks like this:


         Chunk Size : 64 KiB


    As you execute these commands you'll be told that the bad disks appear to contain RAID data and asked if you want to continue anyway.. say yes, because, you backed up.. right?   If you say no, nothing happens.


    Once you're done the array will come on-line and start to resync.  Check on it by typing:


      cat /proc/mdstat


    It should look something like this:


    [root@leah ~]# cat /proc/mdstat

    Personalities : [raid6] [raid5] [raid4]

    md0 : active raid5 sdf[5] sde[4] sdd[3] sdc[2] sdb[1] sda[0]

          4883796992 blocks super external:/md127/0 level 5, 64k chunk, algorithm 0 [6/6] [UUUUUU]

          [====>................]  resync = 22.3% (218174848/976759424) finish=165.9min speed=76172K/sec

        

    md127 : inactive sdf[5](S) sde[4](S) sdd[3](S) sdc[2](S) sdb[1](S) sda[0](S)

          6630 blocks super external:imsm

         

    unused devices: <none>

     

    If it does.. then you are close to celebrating.  But first, try to mount the filesystem on the array.  We'll do this read only for now:


      mount -o ro /dev/md0 /mnt


    If all went well you *should* be able to now see files on /mnt...  try:


      ls /mnt


    See files?  Good.. now what you do is up to you.  Personally I'm going to leave my system alone now until it's finished rebuilding the array.  Once it's done I should then be able to reboot back into Windows and access it normally.  To watch/wait for this to happen enter:


      watch -d cat /proc/mdstat


    And then.. watch!  You may also want to unmount your file system so it stays clean.  Do:


      umount /mnt


    And then go back to watching until the resync hits 100%.  Once the resync is fully complete you can disable the array before rebooting with:


      mdadm --stop /dev/md0


    Now.. reboot back into Windows.


    Best of luck!


    *** THIS IS A DESCRIPTION OF WHAT I USED TO (PROBABLY) RECOVER MY OWN ARRAY - YOUR ARRAY MAY BE SO DIFFERENT THAT MY INSTRUCTIONS WILL BREAK IT FOREVER.  DO NOT FOLLOW MY INSTRUCTIONS UNLESS YOU ARE PREPARED TO LOSE ALL OF YOUR DATA AND NOT HOLD ME OR ANYONE BUT YOURSELF RESPONSIBLE FOR THAT. ***


    Once I'm back in windows, and once I finally get the volume mounted in truecrypt.. I *will* come back here and update this thread with my known success.  If I'm not back soon.. then maybe it didn't work out so well after all and you'd better be careful listening to me    


    EDIT: Several hours later, the array has finished rebuilding and.. this is a sight that I thought I'd never see again...


    http://i.imgur.com/4FxPq.jpg


    A happy, healthy array.  Yay!  Now, let's see if I can actually _use_ the data on it

  • 10. Re: RAID5 Inaccessible after BIOS Reset to Default
    Craig Duttweiler Community Member


    Zyprexa: Thank you so much! You have just saved my life (or at least 4 TBs of my data).


    Intel: Thanks for nothing. This is pretty pathetic. If your cruddy chipsets are going to semi-routinely stomp on RAID meta-data then at least have the decency to let us re-create the meta-data in your own software and re-instantiate the RAID without overwriting all the data in the array!   Needless to say, I will be getting a real RAID solution (and sure as heck not from Intel) ASAP.


    Everyone: I followed Zyprexa's instructions and did manage to get my RAID back, though I did run into a couple snafus that I figured I should document here along with my workarounds. By no means am I a Linux authority, though, so if anyone else has better fixes for the problems I encountered, please feel free to correct mine!

     

    1. I'm not sure what SATA mode should be used in the BIOS when doing all this (AHCI, IDE, or RAID), but I used RAID.


    2. I'm not sure if it matters, but it seemed to me that when re-creating the array with mdadm, it would be best to list the drives in the same order as that in which the original RAID indexed them. To do so, you'll have to match up:

    • drive serial numbers
    • their indices in the RAID
    • their /dev/sdX locations

    mdadm --examine will give you the mapping between the first two, but not the third.  lshw is really handy for the third (there's also a small loop sketch after these commands), but it doesn't seem to be included with Fedora by default.  To install it (after switching to root with su):

         yum install lshw

    and then get all the info you need to do this mapping with:

         lshw -class disk
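
    If you'd rather not dig through lshw's output, a loop like this should print each /dev name next to the serial number the drive reports.  This is just a sketch I haven't run on the live CD (hdparm may need to be installed the same way, and adjust the /dev/sd[a-f] range to match your own drives):

         # print each device followed by the serial number the drive reports
         for d in /dev/sd[a-f]; do
             echo -n "$d : "
             hdparm -I "$d" | grep -i 'serial number'
         done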


    3. Initially, I couldn't execute Zyprexa's mdadm commands because it seems that Fedora had already "activated" my broken RAID (consisting of the one drive that still had any meta-data on it) and I therefore got this error:

         mdadm: device or resource busy

    To fix this, I did:

         cat /proc/mdstat

         mdadm --stop /dev/mdXXX

    (where mdXXX matches the value shown in mdstat). That seems to have released the drive so that I could run the mdadm commands to re-create the RAID. Alternately, maybe this wouldn't have been an issue in the first place if I hadn't followed the instruction to allow mounting dirty arrays? Is that really necessary before the mdadm commands? Or maybe it would be better to do it after? I'm not sure. I'm also not at all sure why I had this problem but it seems that Zyprexa did not.

     

    4. After the second mdadm command, my array did not automatically come back online. It showed up as "inactive" in /proc/mdstat and I couldn't figure out how to activate it. Eventually I rebooted and was happy to see that all the disks showed up in the BIOS as members of the RAID, but once back in Linux, the array still wasn't active. I then rebooted into Windows where it seems that the array is active, and all my data is accessible!  

     

    5. One more downside: The RAID is now "initializing" (which seems to just mean re-creating parity data, not overwriting the underlying data) at an infuriatingly slow pace. For my 4TB RAID, it has completed just about 1% in nearly a day! What the heck, Intel. That's like 3 months to finish the rebuild, and needless to say, I am once again very unimpressed by your crappy RAID. The chances of a disk failure sometime in those 3 months are very much non-zero. Maybe this would be faster under Linux than WinXP 64? If someone could tell me how to get the array active there, I'd be happy to try it.

     

    Many thanks again to everyone for all the info in the thread.

  • 11. Re: RAID5 Inaccessible after BIOS Reset to Default
    Zyprexa Community Member

    Hey, I'm glad it worked for you!

     

    To answer some of your questions.

     

    1. BIOS needs to be in RAID mode.  I tried AHCI, thinking mdadm would prefer direct access to the disks, but in AHCI you can't manipulate the RAID.

     

    2. I didn't have the old order on hand to do that, but since mdadm --examine seemed to know which disks were in the array and what their serial numbers were, I just left it to figure out which disk was which.  I would strongly recommend trying hard to make sure the disks stay in the same order, but I wouldn't discount letting mdadm just figure it all out.  Once I'm done recovering my data I plan to refine this procedure some more; I'll experiment with what I can get away with and report back.

     

    3. I think that's my bad; I would have run mdadm --stop before doing any of the other commands.  I think the time difference between that and getting a working array was so big that I'd simply forgotten I did it.

     

    4. The echo to 'start_dirty_degraded' is what you needed to bring it immediately on-line.  Until it's 100% rebuilt it's considered degraded.

     

    5. If you allow Linux to bring the array up, it seems to rebuild MUCH faster.  I left mine running overnight and it was done almost exactly as I checked it first thing next morning.  By that I figure it took about 6 hours, and that's for 6 1TB drives in RAID 5.

     

    Given the length of your rebuild time I'd recommend moving the data off the array and wiping/re-creating it from scratch - much faster
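
    For anyone collecting this into one place, this is the order I believe everything needs to run in, condensed from my earlier post (same warning as before: duplicate your drives first).  The last two lines are my own addition rather than something I actually needed; they just raise the kernel's md resync speed limits if the rebuild is crawling under Linux:

      # allow a dirty/degraded array to be started
      echo 1 > /sys/module/md_mod/parameters/start_dirty_degraded

      # re-create the IMSM container, then the RAID 5 volume inside it (6 drives, 64 KiB chunk)
      mdadm -C /dev/md/imsm -e imsm -n 6 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
      mdadm -C /dev/md0 -l5 -n6 -c 64 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

      # optional, not part of my original steps: raise the md resync speed floor/ceiling (values in KiB/s)
      echo 50000 > /proc/sys/dev/raid/speed_limit_min
      echo 200000 > /proc/sys/dev/raid/speed_limit_max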

     

    Now, hurry up and set up that new backup solution... I just recently signed up with BackBlaze, which offers unlimited encrypted cloud backup storage for $5/month - less if you pay in advance - and it's fully automated.  Once installed you never have to worry about backing your files up again.

  • 12. Re: RAID5 Inaccessible after BIOS Reset to Default
    Craig Duttweiler Community Member

    5. I'd love to let Linux do it, but as I said, I can't figure out how to get the thing active in Linux. I gather I still need to set "start_dirty_degraded", but that doesn't seem sufficient. Not sure why it automagically started for you and not for me. Do I need to do an mdadm --assemble or something? I'm scared to do that, since I don't know enough about this stuff to be sure that it won't have some awful side effect. (I'm doing this from a LiveCD, btw, so something like the start_dirty_degraded value isn't sticky from one boot to the next.)

     

    re: other solutions. Online backup is certainly one option I'm considering. I already use Mozy for a small amount of my truly important data. For this stuff, I'm also considering RAID6 on a "real" RAID controller, but haven't looked much into it yet.
