8 Replies Latest reply on Oct 30, 2017 1:03 PM by Intel Corporation

    Which disk is in trouble?

    JerrySNB

      I have a two-disk RAID 1 array. My system started behaving very badly, and I suspected that there was a problem somewhere in the disk subsystem. Windows 10 resource monitor showed that the disks were active 100% of the time, and the graph had a very peculiar pattern, even if there weren't any processes particularly busy. Response times were going up to over a second, and applications were not responding.

       

      The problem is that the RST software reported that everything was fine. I couldn't find any way to diagnose the problem until suddenly one of the drives fell over dead. The volume is now degraded, but system performance is back to normal. Now that the drive is junk, I can tell which one it is.

       

      My question is how could I tell a drive was failing, and how could I tell which one?

        • 1. Re: Which disk is in trouble?
          Intel Corporation
          This message was posted on behalf of Intel Corporation

          Thank you for contacting the Intel® Rapid Storage Technology community. We will do our best to assist you with this issue.
           
          To find out which hard drive is defective, you can open the “Control + I” menu by pressing those keys repeatedly while the computer is starting up. The menu lists each hard drive along with its status. There should be an error message next to the disk that is not working, and the serial number shown there lets you identify the physical drive.
           
          Any further questions, please let me know.
           
          Regards,
          Alberto R
           

          • 2. Re: Which disk is in trouble?
            JerrySNB

            I understand what you're saying, but that doesn't solve my problem. The volume was reported as normal, because both drives were still working; but one of them was in serious trouble. I could tell because the disk activity (percent busy, queue length, response time) was not right. The percent busy was 100%, the queue length was sometimes as high as 50, and the response times often reached 1000 ms or more.
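            The symptoms above (percent busy, queue length, response time) can be turned into a simple check. This is only a minimal sketch: the thresholds and the sample readings are assumptions chosen for illustration, not values from RST or Windows, and the function names are hypothetical. It can show that the volume as a whole is struggling, but because the operating system typically sees a RAID 1 array as a single disk, it cannot say which member drive is at fault.

```python
# Assumed thresholds for illustration only; tune them for your own system.
BUSY_PCT_LIMIT = 90.0      # percent of time the disk is active
QUEUE_LIMIT = 2.0          # outstanding requests
RESPONSE_MS_LIMIT = 100.0  # average response time in milliseconds


def is_bad_sample(busy_pct, queue_length, response_ms):
    """A sample is bad if the disk is near 100% busy AND is either
    queueing requests or answering slowly."""
    return busy_pct >= BUSY_PCT_LIMIT and (
        queue_length >= QUEUE_LIMIT or response_ms >= RESPONSE_MS_LIMIT
    )


def looks_saturated(samples, min_fraction=0.5):
    """Return True if at least min_fraction of the samples are bad.

    samples is a list of (busy_pct, queue_length, response_ms) tuples.
    """
    if not samples:
        return False
    bad = sum(1 for sample in samples if is_bad_sample(*sample))
    return bad / len(samples) >= min_fraction


# Hypothetical readings, loosely modelled on the figures in this post.
troubled = [(100.0, 50.0, 1000.0), (100.0, 12.0, 400.0), (35.0, 0.5, 8.0)]
healthy = [(20.0, 0.3, 5.0), (45.0, 1.0, 12.0)]

print(looks_saturated(troubled))
print(looks_saturated(healthy))
```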

             

            Once the drive stopped working completely, the volume went into a degraded state and I could see which drive had died. It took a week for this to happen, during which time the system was nearly unusable.

             

            How could I have told which drive was getting ready to fail? Neither the RST software, BIOS, nor controller reported a problem.

            • 3. Re: Which disk is in trouble?
              Intel Corporation

              Thank you for providing those details. Normally, the firmware of the hard drive reports any problems with the drive's health status to the Intel® RST tool. Sometimes, as in this case, the firmware does not report any inconsistencies on the hard drive, and then the tool will not show any errors or problems with the RAID structure.

              As an option, you can always check with the manufacturer of the hard drive to see whether they offer a tool or application to monitor its health.

              We apologize for any inconvenience.


              Any questions, please let me know.

              Regards,
              Alberto R

              • 4. Re: Which disk is in trouble?
                N.Scott.Pearson

                I thought that RST was regularly polling S.M.A.R.T. data from the drives and could generate an alert if something was wrong????

                • 5. Re: Which disk is in trouble?
                  Intel Corporation

                  Yes, correct, but that S.M.A.R.T. data is generated by the firmware of the hard drive.
                   
                  Regards,
                  Alberto R  
                   

                  • 6. Re: Which disk is in trouble?
                    JerrySNB

                    I tried looking at the S.M.A.R.T. information in my system's BIOS, but it shows only the raw data.
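                    Raw S.M.A.R.T. values can still be compared between the two drives by hand. The sketch below is a rough illustration, not a diagnostic tool: it assumes the raw values have been copied out manually, it watches a few attribute IDs that are conventionally tied to failing sectors (5, 187, 197, 198), and it treats any nonzero raw value as suspicious. Attribute names and meanings vary by vendor, and the drive labels and readings here are hypothetical.

```python
# Conventional attribute IDs often associated with media problems.
# Treating any nonzero raw value as suspicious is an assumption.
WATCHED = {
    5: "Reallocated Sectors Count",
    187: "Reported Uncorrectable Errors",
    197: "Current Pending Sector Count",
    198: "Offline Uncorrectable Sector Count",
}


def suspicious_attributes(raw_values):
    """Return (id, name, raw) for each watched attribute with a nonzero raw value."""
    return [
        (attr_id, name, raw_values[attr_id])
        for attr_id, name in sorted(WATCHED.items())
        if raw_values.get(attr_id, 0) > 0
    ]


def most_suspect_drive(drives):
    """Return the label of the drive with the largest sum of watched raw
    values, or None if no drive has a nonzero watched value.

    drives maps a label to a dict of {attribute_id: raw_value}.
    """
    scores = {
        label: sum(raw for _, _, raw in suspicious_attributes(values))
        for label, values in drives.items()
    }
    if not scores:
        return None
    worst = max(scores, key=scores.get)
    return worst if scores[worst] > 0 else None


# Hypothetical readings copied by hand for two RAID 1 members.
drives = {
    "port0": {5: 0, 197: 0, 198: 0},
    "port1": {5: 12, 197: 3, 198: 0},
}

print(most_suspect_drive(drives))
for attr in suspicious_attributes(drives["port1"]):
    print(attr)
```

                    A drive can also slow down badly without any of these counters moving, so a clean result here does not prove the drive is healthy.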

                     

                    By the time I realized what was happening, it was too late to download anything. The system was too unresponsive. I was doing everything through my phone. I suppose I could have downloaded a utility to my phone and then moved it over to my PC.

                     

                    In any case, I think we've taken this as far as we can.

                    • 7. Re: Which disk is in trouble?
                      JerrySNB

                      Sadly, it doesn't look like Western Digital's software can see past the RAID controller, so that's no help.

                      • 8. Re: Which disk is in trouble?
                        Intel Corporation

                        Thank you very much for providing those updates. We are sorry to hear that the configuration does not work as expected. As you mentioned, we did the best we could to provide the information you were looking for.

                        Any questions, please let me know.

                        Regards,
                        Alberto R