4 Replies. Latest reply: May 9, 2011 10:46 AM by Sunfox

Intel 510 256gb SSD failure...

Sunfox Community Member

I have a fairly new system, built within the past couple of months, based on a DX58SO2 with an i7-990X. I used a 256gb 510 SSD as the system drive, and have a RAID 10 array for data. Since building it the drive has been fine, but on Friday night I began having a few issues with Windows acting a bit flakey (system is normally on 24x7), so I rebooted.

 

When it came back things felt... slow. Like I was using a normal hard drive instead of an SSD. But I didn't think much of it and went to bed.

 

Fast forward to Saturday night. That day's Windows Home Server backup failed (no error given), and over the past couple years I've learned that usually indicates that there's an error in the filesystem that needs correcting. Since the whole PC is still feeling slow, and I had an odd error in Windows Live Mail about a corrupt folder, I load up the SSD Toolbox to check the SMART status. No errors, so I schedule the drive for chkdsk, and reboot.

 

Chkdsk soon gives me an error: "File record segment #### is unreadable." There are 8, all in a row. It goes for a while and finishes all file records, but before doing stages 2 and 3 it reboots. Now it finds more "unreadable" errors: 4 groups of 8 sequential segments. It finishes, and goes through stages 2 and 3, correcting a couple hundred corrupt attribute values.

 

Reboots again - no Windows. It just spontaneously reboots and wants to run Startup Repair. I let it go through the "repairing disk errors" stage for over 4 hours - it says it MAY take an hour. There's no progress indicator or list of things it's done, so I give up and try again. It just does Startup Repair over and over, whether I pick normal, safe mode or whatever.

 

So I move the drive over to another i7-based Windows 7 PC. It detects the drive as "RAW" file system and claims the drive or folder is corrupt or unreadable.

 

I run the SSD Toolbox's "quick scan". Read passes, data integrity hangs. Cancel that, try the "full scan"; within a few minutes it comes back as failed - contact Intel. Good thing WHS has a backup from 2 days ago.

 

For fun I give chkdsk another go. There's now 40 unreadable record segments (another group of 8) and the rest of the errors number in the tens of thousands - a constant scrolling screen of errors for over an hour. I let it finish, but the system still sees it as a corrupt drive.

 

Since I wasn't sure whether the fact that the partition seems corrupt was at fault for the test failure, I delete the partitions, try a Secure Erase - works - and then try re-creating a single partition. No matter what, it just hangs trying to create the partition with the drive light on solid. Rebooted, tried a couple times, same issue.

 

SMART reports a total of 74 reallocated sectors. The drive is less than 60 days old and the reported writes were 1.73TB before I started, and 3.1TB after all of the chkdsk and system repair attempts (seems like a huge increase).
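
To put the jump in reported writes in perspective, here is a minimal arithmetic sketch in Python. It assumes the Toolbox figures are decimal terabytes and that the drive's capacity is a flat 256 GB; both are assumptions, not something the Toolbox states here.

```python
# Figures quoted in the post. Decimal units (1 TB = 1000 GB) are an assumption.
writes_before_tb = 1.73   # reported host writes before the repair attempts
writes_after_tb = 3.10    # reported host writes afterwards
capacity_gb = 256         # nominal drive capacity

# Extra data written during the chkdsk / Startup Repair attempts.
extra_tb = writes_after_tb - writes_before_tb

# The same amount expressed as "number of times the whole drive was filled".
drive_fills = extra_tb * 1000 / capacity_gb

print(f"extra writes: {extra_tb:.2f} TB")
print(f"equivalent full-drive writes: {drive_fills:.1f}")
```

That works out to roughly 1.37 TB of additional writes, or a little over five complete fills of the drive, in about two days.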

 

So... seems to me that it's good and dead. Any thoughts, since my weekend is now wrecked and I can't do anything till Monday?

  • 1. Re: Intel 510 256gb SSD failure...
    koitsu Community Member

    It sounds to me like all the problems you were witnessing could be attributed to the drive having internal issues (particularly a large number of bad flash segments/blocks).  I would strongly suggest a full RMA on this drive.  Contacting Intel on Monday would be your best choice of action.  The SMART data is pretty damning in that regard.

  • 2. Re: Intel 510 256gb SSD failure...
    Sunfox Community Member

    Thanks for the response. Seeing as it's generally non-responsive now (can see it but can't do anything to it), I'm definitely going to have to get it replaced, one way or another.

     

    I know that a "sector" is typically only a few kilobytes, but is 74 a bit high for such a new drive, especially an SSD? I've read reports of some people getting replacements just because they saw 3 or 10.

     

    I just looked at the reallocated sector count for my ancient RAID10 array of four 500GB Seagate 7200RPM hard drives. It's 0 for all four drives, with 32,232 power on hours.

  • 3. Re: Intel 510 256gb SSD failure...
    koitsu Community Member

    You can't compare a classic mechanical hard disk to that of an SSD.  The technology and engineering/design are completely different, including the parts used.  A comparison to a mechanical HDD is like comparing a wooden baseball bat to an aluminium one; they both hit a ball but they're composed of completely different things (i.e. a wooden bat might break in half, while an aluminium bat may dent or bend).

     

    If you purchased the drive recently -- say, within a couple weeks -- and you're already seeing that number of reallocated blocks, then yes that is abnormal.  It almost certainly means that there's an area of NAND flash internal to the drive itself which is bad.  For comparison, I have a couple X25-V (40GB) drives that have been in operation as OS drives for 6 months and they respectively have 1 and 2 reallocated blocks.

     

    Your drive also appears to be having some sort of I/O-related issue (that is to say: certain ATA commands for obtaining drive model, serial string, etc. may work, but actual read/write attempts to a NAND-flash-based LBA will fail), which is a definite reason to get it replaced.

     

    Simple answer: RMA the drive or return it to your place of purchase for a replacement.

     

    Technical discussion below, stated here for purely educational purposes.

     

    The term "sector" is something of a legacy term at this point -- I really wish vendors would stop using it because nobody in the past 15 years uses C/H/S nomenclature.  But sadly lots of tools still refer to things as a "sector".  I suppose part of the confusion lies in the fact that the term "block" is used in all sorts of other pieces of a storage device as well (particularly filesystems), so you end up with people using the term "block" to refer to Thing X while another person uses it to refer to Thing Y, thus confusion.  Examples: if we were talking about NTFS filesystems, the term "block" might refer to "cluster size", which is not the same thing as a block ("sector") on a hard disk (SSD or mechanical).  Likewise, if we were talking about RAID-0, "stripe size" (which sometimes is labelled "block size") is also different/unrelated.  On UNIX or Linux systems, a filesystem "block" (or with some filesystems, cylinder group) also refers to something different/unrelated.  They're all independent terms, despite all the technologies interfacing with one another indirectly.

     

    A reallocated block on an Intel SSD refers to, most likely, a 4096 byte (4KByte) piece of NAND flash on the drive itself.  I could be wrong about the 4096 value; Intel doesn't disclose if a reallocated block (according to SMART) is 512 or 4096.  The reason I can't be sure is that ultimately "it depends" on how Intel implements SMART on their drives.  The tricky part is that drives with 4KByte blocks (such as Western Digital EARS-series drives) actually report their block size as 512 to the OS -- SSDs are no exception to this rule.  There are legacy reasons for this (lots of internal architecture/design pieces of a PC still assume that a "sector" is 512 bytes).  I'd rather not get into a long, even more technical discussion, about how OSes interact with a drive of this nature.

     

    "Reallocated" in this context means that the drive experienced internal problems when writing to that particular section of flash.  The write failed (reasons unknown), and after reattempts, the drive (internally) marked that section of flash as bad and wrote the data to another section.  It won't reuse that section again.  An incrementing number of reallocated blocks, or a number that gradually grows despite the drive being brand new, is usually an indicator that there's some bad flash on the disk and it should be replaced.

     

    A single error -- or in my case, a couple errors on one drive -- after many months of use can be considered normal, especially on MLC drives.

     

    NAND flash is not impervious to problems.  There is a common misconception in the marketplace that SSDs are basically immune to failure; this is flat-out wrong.  They're just as susceptible to failure as a classic mechanical HDD is, just that there's no chance of moving parts causing a failure in an SSD (since there are none :-) ).  Instead, electrical or flash-based problems are what you'll see.

     

    Furthermore, do not let an SSD replace your ability to do backups.  Same thing with RAID; RAID is not a replacement for backups.  Do backups, for situations exactly like this one.
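
To put rough numbers on the reallocated-block discussion in reply 3, here is a minimal Python sketch. It assumes a reallocated block is either 512 or 4096 bytes (the reply notes Intel doesn't disclose which), and the helper names `reallocated_bytes` and `is_growing` are hypothetical, as are the sample reading lists, which are made up purely to show the "growing count" check.

```python
REALLOCATED = 74  # reallocated count reported by SMART in the original post


def reallocated_bytes(count, block_size):
    """Total flash marked bad, for an assumed block size in bytes."""
    return count * block_size


def is_growing(readings):
    """True if any later reading is higher than the one before it."""
    return any(later > earlier for earlier, later in zip(readings, readings[1:]))


# The actual amount of flash involved is tiny under either assumption.
for size in (512, 4096):
    kib = reallocated_bytes(REALLOCATED, size) / 1024
    print(f"{REALLOCATED} blocks x {size} bytes = {kib:.0f} KiB")

# Hypothetical readings taken over time (not real measurements).
print(is_growing([0, 8, 40, 74]))  # count keeps climbing: replace the drive
print(is_growing([2, 2, 2, 2]))    # count is stable: likely normal wear
```

Under either block size, 74 reallocations cover well under a megabyte of flash (37 KiB or 296 KiB), so what matters is less the absolute amount than the fact that the count is high and climbing on a drive that's under 60 days old.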
