12 Replies Latest reply: Jul 28, 2011 7:31 AM by tomf

SMART questions especially "Unsafe Shutdown Count"

tomf Community Member

I installed a new 510 series 120GB SSD in my DH67GD in the last several days, and am alarmed to look at the SMART attributes via the SSD Toolbox, where it says ID C0 Unsafe Shutdown Count is 6. Six unsafe shutdowns? I have been babysitting this thing since birth, having installed W7 HP 64-bit from scratch, and I have yet to experience anything at all that would account for a number like this. No BSODs or power interruptions at all. What am I to think of this? I've in fact only shut the PC down completely a couple of times; the rest of the time it's been placed into S3 Sleep mode.

 

Anybody?

 

Another question: it was suggested in this thread http://communities.intel.com/message/106050#106050 that SMART should be turned-OFF in BIOS, but then refers to a link where contradictory advice is given. For the DH67GD, with an SSD and the Toolbox, should I turn OFF SMART in BIOS?


  • 1. Re: SMART questions especially "Unsafe Shutdown Count"
    mechbob Community Member

    I think this is a software bug. I run my computer on a high-end UPS and always shut down the computer properly, and mine says I have 42 unsafe shutdowns, which I know is not true, and I am the only person who touches this machine.

  • 2. Re: SMART questions especially "Unsafe Shutdown Count"
    koitsu Community Member

    The first question I have is: did the drive have SMART attribute 0xC0 (Unsafe_Shutdown_Count) as zero (0) when you purchased it, and it has increased to 6?

     

    I have a very, very hard time believing this is a bug.  Most consumers/end-users do not understand how the ATA protocol works, nor what "unsafe shutdown" actually means.  It is the responsibility of either the operating system or the underlying storage driver to submit the ATA command 0xE0 (STANDBY IMMEDIATE) right before the OS is to shut down (power off) the system.  This should not happen on a reboot.  Quoting the ATA8-ACS2 specification, section 7.55:

     

    7.55 STANDBY IMMEDIATE - E0h, Non-data
    7.55.1 Feature Set
    This command is mandatory for devices that implement the Power Management feature set.
    7.55.2 Description
    This command causes the device to immediately enter the Standby mode.

     

    Alternately -- and I would need to discuss this with an Intel engineer to be certain -- one should be able to submit ATA command 0xE1 (IDLE IMMEDIATE) with the Unload feature bit set.  Again, quoting ATA8-ACS2 specification, section 7.19:

     

    7.19 IDLE IMMEDIATE - E1h, Non-data
    7.19.1 Feature Set
    This command is mandatory for devices implementing the Power Management feature set.
    7.19.2 Description
    7.19.2.1 Default Function
    The IDLE IMMEDIATE command allows the host to immediately place the device in the Idle mode. Command completion may occur even though the device has not fully transitioned into the Idle mode.

    7.19.2.2 Unload feature
    The optional UNLOAD feature of the IDLE IMMEDIATE command provides a method for the host to cause a device that is a hard disk drive to move its read/write heads to a safe position as soon as possible. Upon receiving an IDLE IMMEDIATE command with the UNLOAD feature, a device shall:
    a) stop read look-ahead if that operation is in process;
    b) stop writing cached data to the media if that operation is in process;
    c) if a device implements unloading its head(s) onto a ramp, then the device shall retract the head(s) onto the ramp;
    d) if a device implements parking its head(s) in a landing zone on the media, then the device shall park its head(s) in the landing zone; and
    e) transition to the Idle mode.
    The device shall retain data in the write cache and resume writing the cached data onto the media after receiving a Software Reset, a Hardware Reset, or any new command except IDLE IMMEDIATE with UNLOAD feature. A device shall report command completion after the head(s) have been unloaded or parked.
    NOTE 11 — The time required by a device to complete an unload or park operation is vendor specific. However, a typical time for a drive to unload heads on to a ramp is 500 ms, and a typical time for a drive to park heads in a landing zone is 300 ms.

     

    I realise IDLE IMMEDIATE with Unload set looks like it's intended for mechanical HDDs -- technically it is.  Sure, SSDs don't have heads to park, but SMART attributes can be used to track all sorts of things.  There are many other references (and another) to my claims.

     

    If you think my claims are nonsense, please read the Intel Solid-State Drive Toolbox User Guide, section 3.4.2.6, for confirmation of my claims.

     

    So, basically your OS or storage subsystem driver isn't properly submitting 0xE0 to the controller prior to the system powering off.  Or the controller is ignoring the command submitted to it.  Or the system powers off too quickly, in the window between the command being issued and the drive being able to process it.  Again: this should not happen on reboot.  A kernel panic (BSOD), abrupt system power-off, or hard system reset (pressing the reset button on the system case) will cause this attribute to increment.

     

    Your next question will be: "So how do I verify what's going across the wire?  How do I debug this?"  The simple answer is: you can't without a SATA protocol analyser in-band (between the SSD and the controller).  You're just going to have to believe me.  :-)
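
    To make "what's going across the wire" a little more concrete, here is a minimal sketch of the two commands discussed above as they would be packed into a SATA Register Host-to-Device FIS. It only builds the bytes and sends nothing to any drive. The helper names are hypothetical, and the 20-byte layout and the UNLOAD values (Feature 44h, LBA 554E4Ch) are taken from my reading of the SATA and ATA8-ACS documents, so treat them as assumptions to check against the specifications.

```python
# Sketch only: builds the bytes of a Register Host-to-Device FIS.
# Nothing here talks to hardware; all function names are illustrative.

FIS_TYPE_REG_H2D = 0x27        # Register Host-to-Device FIS type
ATA_STANDBY_IMMEDIATE = 0xE0   # command quoted from section 7.55
ATA_IDLE_IMMEDIATE = 0xE1      # command quoted from section 7.19
UNLOAD_FEATURE = 0x44          # assumed Feature value selecting UNLOAD
UNLOAD_LBA = 0x554E4C          # assumed LBA signature (ASCII "UNL")


def build_h2d_fis(command, features=0, lba=0, count=0, device=0):
    """Return the 20-byte FIS for a non-data ATA command."""
    lba_bytes = lba.to_bytes(6, "little")
    return bytes([
        FIS_TYPE_REG_H2D,
        0x80,                        # C bit: this FIS carries a command
        command,
        features & 0xFF,
        lba_bytes[0], lba_bytes[1], lba_bytes[2],   # LBA low, mid, high
        device,
        lba_bytes[3], lba_bytes[4], lba_bytes[5],   # LBA 31:24 .. 47:40
        (features >> 8) & 0xFF,
        count & 0xFF, (count >> 8) & 0xFF,
        0,                           # ICC
        0,                           # control
        0, 0, 0, 0,                  # reserved
    ])


def standby_immediate_fis():
    """STANDBY IMMEDIATE takes no parameters."""
    return build_h2d_fis(ATA_STANDBY_IMMEDIATE)


def idle_immediate_unload_fis():
    """IDLE IMMEDIATE with the optional UNLOAD feature."""
    return build_h2d_fis(ATA_IDLE_IMMEDIATE,
                         features=UNLOAD_FEATURE, lba=UNLOAD_LBA)


if __name__ == "__main__":
    print(standby_immediate_fis().hex(" "))
    print(idle_immediate_unload_fis().hex(" "))
```

    A protocol analyser sitting between the controller and the drive would show frames shaped like these just before power-off when the OS and driver do their job; if the counter increments, the assumption is that no such frame reached the drive in time.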

  • 3. Re: SMART questions especially "Unsafe Shutdown Count"
    tomf Community Member

    >mine says I have 42 unsafe shutdown

     

    I've searched High and Low and remain puzzled as to why Intel reports this, but doesn't explain "why" here or anywhere? It certainly is disconcerting to see isn't it...

     

    I did turn-off SMART in BIOS and it seems not to have affected the Toolbox at all, so apparently the BIOS feature is something that only affects startup, as if it were doing a SMART check on boot and would report if a problem, but I can find no confirmation for this for my DH67GD.

  • 4. Re: SMART questions especially "Unsafe Shutdown Count"
    tomf Community Member

    koitsu you & I posted at the exact same time so I am now trying to absorb your comments (thus far, I do not understand! technically). Thanks for contributing and I hope an Intel engineer will jump-in to clarify.

     

    To your question, AFAICT it is entirely possible in my case that my USC of 6 was indeed on the SSD when I installed it. It was a few days before I noticed it in the Toolbox but in the hours since I did I have not seen any incrementing upwards (nor have I had any crashes or "unsafe shutdowns").

     

    In any case since I have an Intel mobo DH67GD and a sole Intel 510 series SSD w/no other hard drives installed, this is purely an Intel issue imo and they should be able to explain it...

  • 5. Re: SMART questions especially "Unsafe Shutdown Count"
    koitsu Community Member

    tomf wrote:

    >>mine says I have 42 unsafe shutdown

    >I've searched High and Low and remain puzzled as to why Intel reports this, but doesn't explain "why" here or anywhere? It certainly is disconcerting to see isn't it...

    >I did turn-off SMART in BIOS and it seems not to have affected the Toolbox at all, so apparently the BIOS feature is something that only affects startup, as if it were doing a SMART check on boot and would report if a problem, but I can find no confirmation for this for my DH67GD.

     

    You're misunderstanding what the PC BIOS option does.  It doesn't help that motherboard manufacturers incorrectly document what the feature does either, and that includes Dell.  Furthermore, this BIOS option has absolutely nothing to do with the topic of this thread.

     

    The BIOS option, when enabled, causes the BIOS itself to query all attached hard disks and submit the ATA command relevant to getting back the "overall SMART health status" value (literally a "drive is OK" and "drive is bad" result); it does not monitor all attributes, it simply looks at the overall health status.  If the drive returns a not-healthy status, the PC BIOS will pause/stop and inform you of this (indicating the drive may be bad, replace it, etc...).  If the drive returns an OK/healthy status, the PC BIOS will continue normally.

     

    When the BIOS option is disabled, the BIOS does not perform a SMART health check at all.  Disabling the option can slightly increase boot speed (maybe by 0.2 seconds AT MOST).
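
    As a rough illustration of that boot-time check, here is a minimal sketch of how the pass/fail answer can be interpreted. It assumes the check is done with the ATA SMART RETURN STATUS command (B0h with Feature DAh), where the drive hands back one of two register signatures; those constants come from my reading of the ATA specification, and `bios_boot_check` is a hypothetical stand-in for whatever the firmware really does.

```python
# Sketch only: interprets the registers a drive returns for
# SMART RETURN STATUS. No hardware access; names are illustrative.

SMART_COMMAND = 0xB0           # ATA SMART command
SMART_RETURN_STATUS = 0xDA     # Feature value for RETURN STATUS
HEALTHY_SIGNATURE = (0x4F, 0xC2)   # (LBA mid, LBA high): no threshold exceeded
FAILING_SIGNATURE = (0xF4, 0x2C)   # (LBA mid, LBA high): threshold exceeded


def interpret_return_status(lba_mid, lba_high):
    """Map the returned register pair to a coarse health verdict."""
    if (lba_mid, lba_high) == HEALTHY_SIGNATURE:
        return "OK"
    if (lba_mid, lba_high) == FAILING_SIGNATURE:
        return "FAILING"
    return "UNKNOWN"


def bios_boot_check(drives, smart_check_enabled=True):
    """Return the drives a BIOS would warn about at boot.

    drives maps a drive name to the (lba_mid, lba_high) pair it returned.
    With the option disabled, no drive is queried at all.
    """
    if not smart_check_enabled:
        return []
    return [name for name, regs in drives.items()
            if interpret_return_status(*regs) == "FAILING"]


if __name__ == "__main__":
    sample = {"ssd": (0x4F, 0xC2), "old_hdd": (0xF4, 0x2C)}
    print(bios_boot_check(sample))
    print(bios_boot_check(sample, smart_check_enabled=False))
```

    Note that nothing in this check looks at individual attributes such as 0xC0, which is why toggling the option has no effect on what the Toolbox shows.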

  • 6. Re: SMART questions especially "Unsafe Shutdown Count"
    koitsu Community Member

    tomf wrote:

    >... In any case since I have an Intel mobo DH67GD and a sole Intel 510 series SSD w/no other hard drives installed, this is purely an Intel issue imo and they should be able to explain it...

    This is not "purely an Intel issue".  I can show you this same issue/behaviour on a classic MHDD when a machine doesn't properly send IDLE IMMEDIATE or STANDBY IMMEDIATE before shutting down.  :-)  I have all sorts of drives that behave like this (particularly ones from Seagate), intermittently too, on Windows.  Just because the system claims to be shutting down doesn't necessarily mean, in every situation, the drive is actually receiving the proper command before the system's power is terminated.  Like I said: without a SATA protocol analyser, there's no way to verify this and you'll just have to trust me.

     

    There's nothing to worry about anyway.  A non-zero RAW_VALUE for SMART attribute 0xC0 is completely acceptable.  SMART won't trip (start reporting "bad health") because the adjusted value (VALUE) hasn't reached THRESH (threshold).  For example, here's my Intel 320-series SSD:


    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Unsafe_Shutdown_Count   0x0032   100   100   000    Old_age   Always       -       15

     

    Do you see anything wrong here?  I see absolutely nothing wrong.  The drive has not been properly shut down a total of 15 times (the RAW_VALUE was 0 when I got this drive).  Can I account for all 15 times this happened?  Yes I absolutely can.  The most recent 9-10 are due to my workstation losing power abruptly.

     

    You need to learn how to read SMART attributes properly.  Would you like me to teach you how to read the above data correctly, or would you rather insist there's a problem that isn't there?  :-)
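
    For anyone who would rather see the reading rules spelled out, here is a minimal sketch that parses one attribute row in the smartctl-style layout above and applies the VALUE versus THRESH comparison. The class and function names are made up for illustration. Two assumptions are baked in: bit 0 of FLAG marks a pre-failure attribute (clear means Old_age), and a THRESH of 0 is treated as "can never trip".

```python
# Sketch only: parses a single smartctl-style attribute row.
# Names are illustrative, not part of smartmontools.
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    attr_id: int
    name: str
    flag: int
    value: int    # normalised value, higher is better
    worst: int    # lowest normalised value seen so far
    thresh: int   # trip point for the normalised value
    raw: int      # vendor-defined counter


def parse_attribute_line(line):
    """Split a row into its columns; RAW_VALUE is the tenth column."""
    fields = line.split()
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        flag=int(fields[2], 16),
        value=int(fields[3]),
        worst=int(fields[4]),
        thresh=int(fields[5]),
        raw=int(fields[9]),
    )


def is_prefail(attr):
    """Assumption: FLAG bit 0 set means a pre-failure attribute."""
    return bool(attr.flag & 0x0001)


def has_tripped(attr):
    """Assumption: THRESH 0 never trips; otherwise VALUE <= THRESH does."""
    return attr.thresh > 0 and attr.value <= attr.thresh


if __name__ == "__main__":
    row = ("192 Unsafe_Shutdown_Count   0x0032   100   100   000"
           "    Old_age   Always       -       15")
    attr = parse_attribute_line(row)
    print(attr.name, "raw:", attr.raw,
          "tripped:", has_tripped(attr), "prefail:", is_prefail(attr))
```

    The point of the sketch is that RAW_VALUE plays no part in `has_tripped`: a raw count of 15, or 42, changes nothing as long as VALUE stays above THRESH.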

     

    EDIT: Here's a multitude of server drives I have which exhibit the same thing you claim is an "issue".  Note the difference in models, and the fact they're not Intel SSDs.

     

    Here's a 3-disk system:

     

    Model Family:     Western Digital Caviar Black
    Device Model:     WDC WD1002FAEX-00Z3A0

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       27


    Model Family:     Western Digital Caviar Black
    Device Model:     WDC WD2001FASS-00U0B0

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       16


    Model Family:     Western Digital Caviar Black
    Device Model:     WDC WD1001FALS-00J7B1

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       23

     

    Another system, with 2 disks (the 2nd disk doesn't track SMART attribute 0xC0, however -- it's too old of a disk):

     

    Model Family:     Western Digital RE2 Serial ATA
    Device Model:     WDC WD3201ABYS-01B9A0

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       14


    Model Family:     Western Digital RE Serial ATA
    Device Model:     WDC WD2500YS-01SHB1

     

    Now a system that has 4 disks in it:

     

    Model Family:     Western Digital VelociRaptor family
    Device Model:     WDC WD3000HLFS-01G6U0

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       8


    Model Family:     Western Digital Caviar Black family
    Device Model:     WDC WD1001FALS-00U9B0

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       25


    Model Family:     Western Digital Caviar Black family
    Device Model:     WDC WD1001FALS-00U9B0

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       11


    Model Family:     Western Digital Caviar Black family
    Device Model:     WDC WD1001FALS-00U9B0

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       18

     

    And finally, one remaining system which DOES have an Intel SSD, as well as a classic MHDD.  Hey look at that, the Intel SSD and the MHDD have the same RAW_VALUE...  I wonder how that happened?  Could it be that the system lost power 9 times?  Hmm, imagine that.  But wait, the system is on a UPS, so how did it happen?  Simple: when I built this system, I did in fact force the power off manually a few times, and a couple other times I used MS-DOS to do BIOS and SSD firmware upgrades and shut the system off abruptly afterwards.  So basically all 9 times are legitimate.

     

    Model Family:     Intel X18-M/X25-M/X25-V G2 SSDs
    Device Model:     INTEL SSDSA2M080G2GC

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Unsafe_Shutdown_Count   0x0032   100   100   000    Old_age   Always       -       9


    Model Family:     Western Digital Caviar Black
    Device Model:     WDC WD1002FAEX-00Z3A0

    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
    192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       9

     

    Starting to get the picture?
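
    The comparison made by eye in the listings above can also be scripted. This is a minimal sketch, assuming the per-drive output has been pasted into one text blob in the same "Device Model:" plus attribute-row layout; the function name is hypothetical and the sample data is the last system from this post.

```python
# Sketch only: pulls the attribute 192 raw count for each drive out of
# pasted smartctl-style text. The function name is illustrative.

def unsafe_shutdown_counts(report):
    """Return a list of (device model, raw value of attribute 192)."""
    counts = []
    model = None
    for line in report.splitlines():
        line = line.strip()
        if line.startswith("Device Model:"):
            model = line.split(":", 1)[1].strip()
        elif line.startswith("192 ") and model is not None:
            # RAW_VALUE is the last column of the row
            counts.append((model, int(line.split()[-1])))
    return counts


SAMPLE = """
Model Family:     Intel X18-M/X25-M/X25-V G2 SSDs
Device Model:     INTEL SSDSA2M080G2GC
192 Unsafe_Shutdown_Count   0x0032   100   100   000    Old_age   Always       -       9

Model Family:     Western Digital Caviar Black
Device Model:     WDC WD1002FAEX-00Z3A0
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       9
"""

if __name__ == "__main__":
    results = unsafe_shutdown_counts(SAMPLE)
    for model, raw in results:
        print(f"{model}: {raw}")
    # Drives in the same chassis that lose power together should agree.
    print("all equal:", len({raw for _, raw in results}) == 1)
```

    When an SSD and a mechanical drive in the same box report the same count, the simplest reading is that both saw the same power-loss events, whatever each vendor chooses to call the attribute.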

  • 7. Re: SMART questions especially "Unsafe Shutdown Count"
    tomf Community Member

    >You're misunderstanding what the PC BIOS option does.  It doesn't help  that motherboard manufacturers incorrectly document what the feature  does either, and that includes Dell.  Furthermore, this BIOS option has absolutely nothing to do with the topic of this thread.

     

    Number 1, I did say "apparently the BIOS feature is something that only affects startup" so I already figured this out. And btw it's MY thread, and it is entitled SMART questions! I'll ask whatever the heck I want to ask, thank-you-very-much.

     

    >Would you like me to teach you how to read the above data correctly, or  would you rather insist there's a problem that isn't there?

     

    >Starting to get the picture?

     

    I don't know what your underwear is all in a bunch about! I never "insisted there was a problem", all I asked was for someone to explain what C0 meant. And then I merely added "It's an Intel mobo w/Intel SSD" so Intel certainly oughta be able to explain exactly what it's about.

     

    Dunno what your agenda is--nor do I care--but I'd still suggest to Intel that if they are going to expose this data to their end-users, they might consider expanding the explanation of same in their Help file.

  • 8. Re: SMART questions especially "Unsafe Shutdown Count"
    koitsu Community Member

    The explanation they provide in the Toolbox User Guide is absolutely accurate.  The reasons for the attribute increasing in RAW_VALUE I explained in an earlier post.

  • 9. Re: SMART questions especially "Unsafe Shutdown Count"
    tomf Community Member

    koitsu wrote:

    >...when I built this system, I did in fact force the power off manually a few times, and a couple other times I used MS-DOS to do BIOS and SSD firmware upgrades and shut the system off abruptly afterwards.  So basically all 9 times are legitimate.

     

    I did read the Intel User Guide--but it doesn't suggest real-world reasons why "unsafe shutdowns" might accumulate. Had you simply stated (just) the above in Post #2 you would have saved us both a lot of grief.

     

  • 10. Re: SMART questions especially "Unsafe Shutdown Count"
    koitsu Community Member

    Because I get email or find comments about this non-issue every month or two, I decided to make a YouTube video about it with a brand new Intel 510-series SSD.

     

    http://www.youtube.com/watch?v=NHcgitmP70w

     

    Keepin' it real like a sprayed snow tree...

  • 11. Re: SMART questions especially "Unsafe Shutdown Count"
    parsec Community Member

    Very good explanation koitsu, even if it wasn't appreciated.  Rather than having an agenda, IMO koitsu was frustrated that tom did not get the point he was trying to make.  koitsu's explanation dove right in to the technical explanation, when what was also needed IMO was an overview of what SMART is and where it came from.  That would have given tom the general information that he also needs, IMO.

     

    Intel did not create SMART, nor do they own or control it; it is used by most if not all computer data storage drive manufacturers.  The SMART standards were created by the SFF (Small Form Factor) committee, which is composed of people from many companies, including Broadcom, Dell, Foxconn, HP, Hitachi, IBM, Intel, Molex, Pioneer, Samsung, Seagate, Sun Micro, TI, Toshiba, and others.  The basic idea is simply to provide information about the health and usage statistics of a storage drive, usually an HDD or SSD, with the intent of allowing users to see if their drives are fine or are approaching or on the verge of failure.

     

    Hardware that uses disk drives, like motherboards, may have software that reads the SMART data from a drive and displays it.  The usefulness of the display and the format of the data are questionable, and as with anything as complex as a computer, one must learn to interpret the data in order to draw meaningful conclusions.  In other words, if you don't truly understand it (I don't), glancing at it tells you nothing.  IMO, SMART is one of the worst "standards" since there seems to be no true standard, or at least not one that most manufacturers adhere to.  Add to that the fact that a manufacturer can create their own SMART data attribute, whose numeric value must be interpreted by their rules.  If the reader or program that reads the data is not aware of the translation or interpretation, how do you use the data?  That's why you see the SMART table data column titled "Raw", which is the number read but not interpreted.  The reader is left to interpret the data, which takes effort and research.  That is not Intel's fault, that is just the nature of SMART (insert sarcastic comment about this being not-SMART here.)
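
    To show what "read but not interpreted" means in practice, here is a minimal sketch that decodes one attribute entry from a drive's SMART data table. It assumes the commonly documented 12-byte entry layout (ID, 16-bit flags, value, worst, 6 raw bytes, 1 reserved byte), which I have not verified against any one vendor's documentation. The name table only contains the two labels that appear in this thread for ID 0xC0, and every identifier is hypothetical.

```python
# Sketch only: decodes one assumed 12-byte SMART attribute entry.
# The layout and the name table are illustrative assumptions.
import struct

# The same attribute ID carries different vendor labels (from this thread).
NAMES_FOR_0XC0 = {
    "intel": "Unsafe_Shutdown_Count",
    "wdc": "Power-Off_Retract_Count",
}


def decode_entry(entry):
    """Split a 12-byte entry into its fields; raw stays uninterpreted."""
    if len(entry) != 12:
        raise ValueError("expected a 12-byte attribute entry")
    attr_id, flags, value, worst = struct.unpack_from("<BHBB", entry, 0)
    raw = int.from_bytes(entry[5:11], "little")   # 48-bit vendor counter
    return {"id": attr_id, "flags": flags, "value": value,
            "worst": worst, "raw": raw}


def label(attr_id, vendor):
    """Look up a vendor-specific name, falling back to a generic one."""
    if attr_id == 0xC0 and vendor in NAMES_FOR_0XC0:
        return NAMES_FOR_0XC0[vendor]
    return f"Unknown_Attribute_{attr_id}"


if __name__ == "__main__":
    # Hand-built entry: ID 0xC0, flags 0x0032, value 100, worst 100, raw 15
    entry = bytes([0xC0, 0x32, 0x00, 100, 100, 15, 0, 0, 0, 0, 0, 0])
    fields = decode_entry(entry)
    print(label(fields["id"], "intel"), fields)
    print(label(fields["id"], "wdc"), fields)
```

    The bytes are identical in both cases; only the label, and the vendor's rules for what the raw counter means, differ.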

     

    Regardless, it is a valid question why one would see a count beyond 0 or 1 in the Unsafe Shutdown attribute... or is it?  How many times during building and starting a PC do we have an "oops" moment?  More than we remember.  Also, as koitsu states, who knows which occurrences that we consider normal are reported as unsafe in the SMART data?  I'm not smart enough to know that.

     

    The unsafe shutdown counts for my drives are all in the 10s to 40s, and I know in very general terms why that is the case.  They don't bother me at all.

  • 12. Re: SMART questions especially "Unsafe Shutdown Count"
    tomf Community Member

    Glad parsec you have the spare time to read all this and make your own "War and Peace"-length post expressing your opinion.

     

    Some of us prefer "clear and concise"!

     

    Marking this "answered".
