3 Replies Latest reply on Jul 29, 2011 8:32 PM by koitsu

    Intel RST verifies 'blocks with media errors' but does not fix them

    DouglasWIlliamSmith

      Intel RST verifies 'blocks with media errors' but does not fix them.

       

      What will 'fix' media errors? If they cannot be fixed, will reformatting the matrix and reinstalling Windows format and install 'around' the media errors? Or will it be necessary to get new SSDs (and if so, how do I find out which one has the media errors)?

       

      RST version: 10.6.0.1002

       

      RAID 0: 2x256GB Crucial SSDs (C300-CTFDDAC256MAG) (system boot array)

       

      The drive has a healthy status and no verification errors, but 2 'media errors' (which persist after repeated verifications and restarts).

       

      The computer is prone to blue screen crashes which minidumps indicate are repeatedly caused by ntoskrnl.exe

       

      I assume the file ntoskrnl.exe is 'a bit' corrupt (perhaps part of it is sitting in the blocks with media errors?)

        • 1. Re: Intel RST verifies 'blocks with media errors' but does not fix them
          koitsu

          The term "media error" is too vague.  I have to assume it means an LBA could not be read from or written to.  With SSDs, that points to an LBA on one of your SSDs that may be failing (i.e. the SSD is going bad; wear levelling cannot fix this).  Since you have a RAID-0 array, you will lose all of the data on the array if one of your SSDs has to be replaced.  I hope you do backups!

           

          "The drive is healthy and no verification errors" means absolutely nothing to me.  I don't trust highly-abstracted software under any circumstance.

           

          Because your drives are in a RAID array, getting SMART statistics from them is extremely difficult.

           

          I can try to step you through how to use the latest version of smartmontools (which has experimental/alpha support for Intel RST/MatrixRAID) to get SMART statistics from your two SSDs.  That could help determine which SSD is experiencing problems, but no guarantees.  Be aware you will need to become familiar with command-line utilities (Command Prompt / cmd.exe) and follow very specific instructions.  If you're using Windows 7, you'll need to make sure you run Command Prompt as Administrator every time.

           

          Please let me know if you want to proceed with this method.  If you think I'm talking out my rear, I can give you some real life examples (links) where I've spent weeks working with people on forums to determine if their drives are in fact bad and to track down the problems with their mechanical HDDs using the LBA scanning features of SMART.  SSDs are a little different in this regard though, so you'll have to cut me some slack.  I'll need to know exactly what OS you're running and if it's 64-bit or not.

           

          Otherwise, you're going to have to hook each SSD up to a (non-RAID-mode) SATA port on a separate computer and get SMART data that way, using common tools like smartmontools (recommended), HD Tune Pro, or Speccy.  There are a multitude of other SMART utilities out there which do not show full attribute data; those utilities should be avoided.

           

          Finally, be aware "media errors" can happen on any kind of drive -- mechanical or SSD.  Anything can go bad.  Using RAID-0 means you're living excessively dangerously, so you may want to consider getting rid of the RAID aspect to minimise future failure impact.  But on mechanical HDDs, a lot of the time "bad blocks" aren't really bad -- they've just been marked suspect by the drive firmware due to repetitive re-reads being required and similar.  A write to those LBA(s) can get the LBA re-analysed and either marked usable or force a remap.  With SSDs, it's totally different.

          • 2. Re: Intel RST verifies 'blocks with media errors' but does not fix them
            DouglasWIlliamSmith

            Yes, I would like to get SMART data about my SSDs 'in situ' (that is, while still in the RAID matrix).

             

            I back up to a non-RAID HDD regularly.

             

            The PC crashes constantly anyway for reasons I cannot identify in the event log, so I am ready to simply reinstall fresh. I just want to know whether I can reinstall using my existing SSDs or whether I need a new one!

             

            = Doug

            • 3. Re: Intel RST verifies 'blocks with media errors' but does not fix them
              koitsu

              I'm going to assume you're not familiar with Windows Command Prompt, so excuse my simple instructions if you're already familiar with CLIs.

               

              1. Please download smartmontools 5.41 and install it.  Be aware in advance that this is a command-line (CLI) tool, not a GUI tool, so you'll be spending a lot of time in the Command Prompt.  Also be aware you cannot harm any data on your SSDs, or your system, using this utility -- especially with what we're doing below.  It's all non-invasive.
              2. If using Vista or Windows 7, launch a Command Prompt as Administrator.  If using XP, just run Command Prompt.
              3. Now comes the tedious part: you get to try and figure out what the proper device string is for your drives when behind Intel RAID.  The device string is /dev/csmiX,N but many users have found /dev/csmiX is sufficient.  X and N should be numbers starting at 0 and they're independent of one another.  You have two SSDs, so ideally the device strings would be /dev/csmi0 and /dev/csmi1, but it could also be /dev/csmi0,0 and /dev/csmi0,1, or even /dev/csmi0,0 and /dev/csmi1,0.  I simply don't know.  Please try combinations until you find both SSDs (to tell the two apart, look at the Serial Number or the LU WWN Device ID).
              4. To view the SMART attributes (assuming the device is correct), use the command:
                smartctl -a DEVICE

                where DEVICE is described in step 3 above.
                1. Assuming you find a match and don't see an error, you'll get a screen full of SMART attributes.  Command Prompt is generally a disappointing utility with no real easy way to copy-paste while scrolling the window.  So if you want to check quickly whether you've found a drive, without lots of data on the screen, you can try running:
                  smartctl -a DEVICE | findstr "Model Device Serial WWN"
                  (the pipe symbol and the quotes are both needed!) which should get you just certain lines of the output.
                2. Once you've found both your SSDs, write down / take note of what the DEVICE string is that you used for both SSDs.  That way you won't have to go through the above rigmarole again, as long as you don't add/remove devices to/from RAID.
                3. Finally, let's get full SMART data and store the output in two files (one for each SSD) which you can load into Notepad or attach here to the forum directly so I can review them.  For each SSD, do:
                  smartctl -a DEVICE > C:\ssdX.txt
                  where DEVICE should be obvious by now, and X is a number to distinguish the SSDs (e.g. first SSD could be ssd0.txt, second SSD could be ssd1.txt).  You get the idea.  You won't see any output on the screen when you execute this command, but you should find .txt files in C:\ which have the SMART attributes in them.
                4. You can type exit to exit the Command Prompt.
                5. Once I can review that data we can proceed further.  I'd prefer you upload the .txt files somewhere and provide links to them, rather than copy-paste, since the forum might mess up the formatting (which makes the output much harder to read and increases the likelihood of me making a mistake).  If you do copy-paste, please change the output to use the Courier font.  Also, please do not uninstall smartmontools in the meantime; you'll be needing it again if we need to do a selective LBA scan (to test or find bad LBAs, if there are any).
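
For anyone who would rather not type every combination from step 3 by hand, here is a rough Python sketch of the same trial-and-error search.  It is not part of smartmontools and not something from the posts above: the function names, the default limits of 2 for X and N, and the idea of matching on the "Serial Number:" line are all assumptions made for illustration.  It only ever calls smartctl -a DEVICE, the same read-only command used in step 4.

```python
import subprocess


def candidate_devices(max_x=2, max_n=2):
    """Device strings to try: the short /dev/csmiX forms first, then /dev/csmiX,N.

    max_x and max_n are arbitrary illustrative limits, not values from smartmontools.
    """
    short = [f"/dev/csmi{x}" for x in range(max_x)]
    full = [f"/dev/csmi{x},{n}" for x in range(max_x) for n in range(max_n)]
    return short + full


def serial_number(smartctl_output):
    """Pull the value of the 'Serial Number:' line out of smartctl text output."""
    for line in smartctl_output.splitlines():
        if line.strip().startswith("Serial Number:"):
            return line.split(":", 1)[1].strip()
    return None


def run_smartctl(device):
    """Run 'smartctl -a DEVICE' and return its text output (None if it cannot be run)."""
    try:
        result = subprocess.run(
            ["smartctl", "-a", device], capture_output=True, text=True
        )
    except OSError:
        # smartctl is not installed or not on the PATH
        return None
    return result.stdout


def find_drives(run=run_smartctl, devices=None):
    """Map each distinct serial number to the first device string that reported it."""
    found = {}
    for dev in devices or candidate_devices():
        output = run(dev)
        serial = serial_number(output) if output else None
        if serial:
            # Several device strings may reach the same SSD; keep the first one.
            found.setdefault(serial, dev)
    return found
```

Calling find_drives() with no arguments from an Administrator prompt would try each candidate in turn; with two SSDs behind the RAID you would hope to get back two serial numbers, each paired with a DEVICE string to write down for the later steps.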

                 

                An example session for a directly-connected disk (no Intel RAID, which is why the DEVICE string is different):


                C:\Documents and Settings\jdc>smartctl -a /dev/sda | findstr "Model Device Serial WWN"
                Model Family:     Intel 510 Series SSDs
                Device Model:     INTEL SSDSC2MH120A2
                Serial Number:    LNEL123100QS120CGN
                LU WWN Device Id: 0 1507a5 1e26ba7c7
                Device is:        In smartctl database [for details use: -P show]
                Device does not support Selective Self Tests/Logging


                C:\Documents and Settings\jdc>smartctl -a /dev/sda > C:\ssd0.txt

                 

                C:\Documents and Settings\jdc>exit
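
A note on why the filtered output in the session above contains six lines rather than four: findstr treats a quoted, space-separated string as a list of separate search words and keeps every line that contains any one of them (case-sensitive by default), so "Device is:" and "Device does not support..." get through as well.  The small Python sketch below mimics that behaviour.  The function name is made up, and the two "SMART ..." lines in the sample are only there to show lines being filtered out; the other six lines are copied from the session.

```python
def findstr_any(text, words):
    """Keep lines that contain at least one of the words (case-sensitive substring match)."""
    return [
        line for line in text.splitlines()
        if any(word in line for word in words)
    ]


SAMPLE = """\
Model Family:     Intel 510 Series SSDs
Device Model:     INTEL SSDSC2MH120A2
Serial Number:    LNEL123100QS120CGN
LU WWN Device Id: 0 1507a5 1e26ba7c7
Device is:        In smartctl database [for details use: -P show]
SMART support is: Enabled
SMART overall-health self-assessment test result: PASSED
Device does not support Selective Self Tests/Logging
"""

matched = findstr_any(SAMPLE, ["Model", "Device", "Serial", "WWN"])
for line in matched:
    print(line)
```

The two "SMART ..." lines are dropped because they contain none of the four words, which is all the findstr step is doing: trimming the output down to the lines that identify the drive.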


                Good luck.