21 Replies. Latest reply: Apr 23, 2012 9:33 AM by john_s@intel

SS4000-E :  Questions on pushing the limits

zebraitis Community Member

Hello,

 

 

BACKGROUND:  according to the support document at http://www.intel.com/support/motherboards/server/ss4000-e/sb/CS-022215.htm , the SS4000-E w/ firmware 1.4 v710  is stated to have a maximum capacity of 3TB.

 

MY QUESTION:  Is that a limitation of the software design of the firmware, or was that number based on hard drive sizes available at the time?

 

 

WHY I ASK:

 

I write this with the hope that one of the support members has tried this, or one of the community members may have tried this.

 

I bought several of these SS4000-E boxes, new from a reseller blowing them out at a very low price.

 

Disclaimer:  I understand that it's past end-of-sale and is end-of-support.  And, I understand the list of limited "supported" hard drives.  I have read the documentation available.  (So, I acknowledge that this is a "message-in-a-bottle" support request that may or may not be answered.)

 

That being said, it looks like the SS4000-E with 1.4 v710 firmware will work with 2TB drives.   (Notice that I did not say "supported" as they are not on the list of supported drives)

 

...at least, it LOOKS like it does when I plug 4 of them into the SS4000-E box, set them up in RAID 5, and create 3 partitions (2 @ 2048 [1.99TB each], one w/ remainder space [1.45TB]).
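For readers who want to check the numbers, here is a rough sketch of the capacity arithmetic. It assumes each "2TB" drive holds 2 x 10^12 bytes, that the firmware reports sizes in binary units, and that partitions are capped at 2048 of those units. It ignores filesystem and metadata overhead, and the function names are made up for illustration.

```python
GIB = 2**30  # bytes per binary gigabyte (assumed to be the firmware's "GB")


def raid5_usable_bytes(drive_count, drive_bytes):
    """Usable RAID 5 capacity: one drive's worth of space goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_bytes


def split_partitions(total_gib, max_partition_gib=2048.0):
    """Carve the array into partitions of at most max_partition_gib each."""
    sizes = []
    remaining = total_gib
    while remaining > max_partition_gib:
        sizes.append(max_partition_gib)
        remaining -= max_partition_gib
    if remaining > 0:
        sizes.append(remaining)
    return sizes


usable_gib = raid5_usable_bytes(4, 2 * 10**12) / GIB
print(round(usable_gib, 1))                                   # 5587.9
print([round(s, 1) for s in split_partitions(usable_gib)])    # [2048.0, 2048.0, 1491.9]
```

Under those assumptions the remainder works out to about 1492 binary GB, or roughly 1.46 binary TB, which is in line with the 1.45TB third partition and with the 5587.5GB total reported on the Home screen later in this thread.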

 

So far, so good.

 

Then, since I do have these three partitions, I began restoring data to these drives.  (Using a free product "SyncBack", a third party program, and this is running in a Win7 home environment)

 

I filled up my 2TB public partition and started to add data to my 2TB public-2 partition... when that was done, I found that I received a "DISK CHANGE NOTIFICATION" message upon logging in.  I could not reach one of the partitions.

 

Well, I did not change the disk, I could not tell if there was a failure, and I could not get past that screen.   I did verify that the disk sizes and serial numbers recognized by the Intel SS4000-E firmware were EXACTLY the same.  And, the disks on that page showed no errors.

 

 

So, I reinitialized and restarted.   (Yes, I always have a back-up for my backup.  I am a technologist, I have no faith in any single-point solution).

 

 

This time, I started restoring with "public-3" (w/ 391GB of data) and then "public-2" (w/ 1.1TB of data) ... everything seemed to be going well.  I stopped and started my restore several times along the way without problems.

 

 

However, when I got to the end of restoring the second partition, AGAIN, I found that I had received a "DISK CHANGE NOTIFICATION " error.

 

And, of course, again, no disk failure or change, with all lights on the box a happy shade of green.

 

 

I DID notice, this time that the solution said:  "Current state: RAID 5 (NORMAL, Resync : 73 %, Finish : 1873 min, Speed : 4540K/sec)"
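A status line like this one can be pulled apart with a small script, which helps when comparing readings taken at different times. The sketch below is based only on the single line quoted above. Other firmware messages may be formatted differently, and the function and field names are my own.

```python
import re

# Pattern derived from the one status line quoted in this post (an assumption).
STATUS_PATTERN = re.compile(
    r"RAID\s*(?P<level>\d+)\s*\(\s*(?P<state>\w+)"
    r"(?:,\s*Resync\s*:\s*(?P<percent>\d+(?:\.\d+)?)\s*%"
    r",\s*Finish\s*:\s*(?P<minutes>\d+)\s*min"
    r",\s*Speed\s*:\s*(?P<speed>\d+)K/sec)?\s*\)"
)


def parse_raid_status(line):
    """Return the RAID level, state and (if present) resync progress."""
    m = STATUS_PATTERN.search(line)
    if m is None:
        raise ValueError("unrecognized status line: %r" % line)
    info = {
        "level": int(m["level"]),
        "state": m["state"],
        "resyncing": m["percent"] is not None,
    }
    if info["resyncing"]:
        info["percent_done"] = float(m["percent"])
        info["hours_left"] = int(m["minutes"]) / 60
        info["speed_k_per_s"] = int(m["speed"])
    return info


status = parse_raid_status(
    "Current state: RAID 5 (NORMAL, Resync : 73 %, Finish : 1873 min, Speed : 4540K/sec)"
)
print(status["state"], status["percent_done"], round(status["hours_left"], 1))
# NORMAL 73.0 31.2
```

By this reading, the firmware's own estimate was a little over 31 hours of resync remaining.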

 

Here is a screenshot, resync numbers have changed slightly:

 

aaaa.bmp

 

 

 

And, I can reach my mapped "public-2" and "public-3" partitions and see and use that data on those partitions, but I cannot reach my mapped system-created "public" partition.  (And, if I access the resource directly, I can access "admin", "public-2", "public-3", but not "public".)

 

Of course, as I have the "DISK CHANGE NOTIFICATION" message screen, I cannot get past that to see if that partition still exists on the system or not.

 

 

So, THIS time, rather than reinitializing and starting over, I am choosing to let the RAID restore process run and see what may be the result in a day or two.  (if it will actually work or not)

 

 

But ultimately, I want to know:  Is the 3TB limit mentioned in the documentation something that may be causing the system (with 4 2TB Drives)  to believe it has had a failure?

 

... and will it ever work with 4 @ 2TB drives?

 

 

I hope for a response.  I will post my RAID restore results here in a few days time.

 

Thanks,

 

v.

  • 1. Re: SS4000-E :  Questions on pushing the limits
    john_s@intel Community Member

    The large HDDs "should" work as the 1.4 version of firmware (operating system) supports storage capacity greater than 2 TeraBytes (TB). As you see, the storage is divided into 2TB partitions, including one for a shared public folder, one for user home folders and one for backups. However, Intel has never tested HDDs larger than 1 TB and does not know how they'll work.

     

    That said, the way you're configured should be fine. The public1, public2 and public3 shared folders will actually be separate partitions like /dev/vbdi4, vbdi5 and vbdi6, mounted on /nas/NASDisk-00004, NASDisk-00005 and NASDisk-00006. These partitions will each be 2TB.
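If a mount table in the standard Linux /proc/mounts format can be obtained from the unit (whether the diagnostic output includes one is not confirmed here), a check along these lines would show which of the expected mount points are missing. The sample text below is hypothetical, including the filesystem type and options. Only the device and mount point names come from the description above.

```python
def mounted_nas_disks(mount_table, prefix="/nas/NASDisk-"):
    """Map each mount point under `prefix` to its device, from /proc/mounts-style text."""
    found = {}
    for line in mount_table.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 2 and fields[1].startswith(prefix):
            found[fields[1]] = fields[0]
    return found


# Hypothetical sample: filesystem type and options are placeholders.
sample = """\
/dev/vbdi5 /nas/NASDisk-00005 ext3 rw 0 0
/dev/vbdi6 /nas/NASDisk-00006 ext3 rw 0 0
"""

expected = ["/nas/NASDisk-00004", "/nas/NASDisk-00005", "/nas/NASDisk-00006"]
mounts = mounted_nas_disks(sample)
missing = [point for point in expected if point not in mounts]
print(missing)  # ['/nas/NASDisk-00004']
```

A missing entry would suggest that the partition failed to mount, as opposed to the share simply not being exported over the network.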

    The different partitions are still part of the single RAID array, so one partition would not normally drop out by itself unless there was damage to the specific sectors in that partition. Damaged sectors on a specific disk may also explain the "one or more drives had either failed or been changed" message.

    You can troubleshoot the system by creating a diagnostic file (XRay) for the SS4000 and analyzing the results. See:
    ftp://download.intel.com/support/motherboards/server/ss4000-e/sb/ss4000etroubleshootingguide.pdf

     

    John

  • 2. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    John,

     

    Thank you for the input.

     

    As a result of my wait, I was rewarded with the RAID reporting that it is "Normal".

     

    HOWEVER...  I am still stuck at the "Disk Change Notification" screen.

     

    Screenshot:

    RAID Normal.jpg

    Choosing [Scan] or [Continue] appears to do nothing but return me (again) to the Disk Change Notification screen.

     

    I am able to [ShutDown], however on restarting the SS4000-E, the system returns to this same screen.

     

     

    While this appears to be progress, sadly, the results are worse than before:  At this time, I can access the 200MB Admin directory, but now I can no longer access Public, Public-2 or Public-3.

     

    Those three partitions make up the majority of the 4 @ 2TB drives in RAID 5, and this is a different result than before the reboot, where I could access Public-2 and Public-3.

     

    Q:  Is there any command that can "force" the system to get past this Disk Change Notification screen?

     

     

    What I was able to do was use the suggested command that you provided, turn on debugging and generate the XRAY file.

     

    I have uploaded this file to http://dl.dropbox.com/u/32146825/xray.tgz .  I would appreciate if you could take a look at this, or provide guidance for what key items that I should consider when looking at the output.

     

     

    Finally, turning off debugging did not make any changes to the accessibility of the partitions (not that I thought that it would, but hey, what the heck, right?)

     

     

    As before, all lights on the SS4000-E are green, not indicating any errors.  There appears to be some drive activity, based on sounds/vibration of the unit.  At this point I am planning on letting this run for a day to see if there will be any changes to the partition availability. (not optimistic there either)

     

     

    Again, thank you for your support (and the support of any others that may provide input, of course).

     

    vincent

  • 3. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    Update:  After running the SS4000-E overnight, there were no changes:  Public shared space was still unavailable.

     

    So, to remove the drive set from the equation, I replaced:

     

     

    with:

     

     

    Observations:

     

    1. The Seagate drive is a lower RPM drive,
    2. On placing the Hitachi drives into another manufacturer's NAS solution (that provides more access to SMART data), one of the drives DID show a SMART event.  However, I would think that one SMART event should not lock up the entire SS4000-E solution, as the drive passed the system's drive test.
    3. On reinitializing and reconfiguring the Seagate Drives into the SS4000-E, I did notice that while the partitions create quickly, on checking advanced / drives I do see that the RAID does not actually get configured (resync'ed), and that it seems to actually be formatting/testing the partitions.

     

    Screenshot showing the Seagate drives, and the beginning of a "resync" process that appears as if it will take nearly 5 days (!!)

     

    drives.jpg

    Of course, my hope(s) are three-fold:

     

    1. That the preparation / resync of the solution (with no data on the drives) should not take an actual 5 days,
    2. That waiting for this resync to complete BEFORE adding data will allow the system to better accept the larger capacity.  (Possibly, I had overwhelmed the solution when adding data while the initial resync was taking place?)  And...
    3. That the Seagate drives will behave differently than the Hitachi Drives.

     

    The good news is that we once again are able to reach the SS4000-E Home screen, and see a RAID 5 configured storage space of 5587.5GB.

     

    Screenshot:

     

    Home Image.jpg

     

    I would still be very interested in discovering what the xRay output states for the previous crash (as I would be very interested if it was system limitation related), however I am moving forward with this attempt.

     

    I will continue to document the results for others interested in the outcome of the SS4000-E with 2 TB drives.

     

    Regards,

     

    Vincent

  • 4. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    Update:

     

  • 5. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    The Seagate drives have finished their initialize/resync. (actual time elapsed was closer to 7 days)

     

    RAID status is Normal, Hotplug Indicator shows Yellow for all disks.

     

    I will begin to restore data to drives today, I will provide updates at various milestones.

     

    Yellow.png

  • 6. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    MAJOR SETBACK.

     

    The plan was to change the NAS name from NAS-4 to NAS-1, as well as change the fixed IP address.

     

    On changing the NAS name, two hard drives (#1 and #4) went offline (drive indicator lights on SS4000-E box off).

     

    Screenshot:

     

    Disck failed on rename.png

     

    Scan did not get the drives to recognize.

     

    Shutdown / reboot required the NAS to be found using the Storage System Console tool.

     

    As you would think, losing two drives broke the RAID.  The system reports the same drives, but that they are "new".

     

    Screenshot:

    Reboot new.png

     

    Looks like it's time to reinitialize the disks (again).

  • 7. Re: SS4000-E :  Questions on pushing the limits
    john_s@intel Community Member

    zebraitis,

     

    Changing the storage system name or IP address does not have an effect on the RAID. Why would it? It may cause the system to be inaccessible in a networked environment until the name and/or IP address propagates through a DNS'd network, but won't be the cause of a RAID failure.

     

    I just tried both a name and IP address change on my lab system with no adverse effects. Changed from Storage101 with a RAID 5 configuration to Storage999 and IP address 192.168.101 to 192.168.1.150, rebooted and the storage system was now Storage999, IP address 192.168.1.150 with RAID 5.

     

    During the boot I monitored the boot process and the disks were indicated as active sync, and in the operating system storage console the Advanced: Disks Hotplug indicators were all yellow. All functions are operating normally.

     

    Storage system software versions 1.0 through 1.4 are a "mildly" proprietary build based on standard Linux Kernel version 2.6. There are no "kill RAID" commands built in for system name or IP address changes.

     

    John

  • 8. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    John,

     

    Thanks for monitoring this journey.

    john_s@intel wrote:

    Changing the storage system name or IP address does not have an effect on the RAID. Why would it? It may cause the system to be inaccessible in a networked environment until the name and/or IP address propagates through a DNS'd network, but won't be the cause of a RAID failure.

     

    You know, logically, I absolutely agree with you, and I expected no surprises.

     

    But yet, there was a negative impact.  Knowing that it was a very unexpected result, that is why I included the screenshot.  Surprised, I even went and looked at the NAS box, and the lights on Drive #1 and #4 were off, just as the screen shot /disk change notification showed.

     

    Other than changing the name, there was no other action taken.

     

    Could it be a fluke?  Sure.  Absolutely could be.

     

    And, possibly, I may have been able to remove/reinstall the two drives to see if they would again power up.  However, I did skip that step and moved forward with reinitialization, as the second screenshot above did not show a surviving RAID.

     

    Here is the system log, which shows the synchronization completing, and then (seven hours later) a disk error nearly concurrent with my initiating the name change.  (Note: read from the bottom up.)

     

    failure.png

     

    Sadly, as they say "it is what it is".

     

    I continue on.

  • 9. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    Update:

     

    For those who may read this thread and are curious...  The initialization / resync of the 4 @ 2TB drives in a RAID 5 configuration appears to process at a rate of 20% per day.
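As a rough cross-check on that rate, the sketch below estimates the resync time in two ways. It assumes the resync walks the full 2 x 10^12 bytes of each member drive at a constant rate, and that the "K" in the speed figure quoted in the first post (4540K/sec) means 1024 bytes. Both are assumptions, and the function names are illustrative only.

```python
SECONDS_PER_DAY = 86400


def resync_days_from_speed(member_bytes, speed_k_per_s):
    """Days to resync one member drive's capacity at a constant speed (K = 1024 bytes)."""
    seconds = member_bytes / (speed_k_per_s * 1024)
    return seconds / SECONDS_PER_DAY


def resync_days_from_progress(percent_per_day):
    """Days to reach 100% at a constant daily progress rate."""
    return 100.0 / percent_per_day


print(round(resync_days_from_speed(2 * 10**12, 4540), 1))  # 5.0
print(resync_days_from_progress(20))                       # 5.0
```

Both estimates come out at about 5 days. The earlier Seagate resync actually took closer to 7 days, so the rate is evidently not constant over the whole run.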

     

    While this resync process occurs, the drives show RED, which indicates that the RAID would be broken if a drive failed or was removed.

     

    After two days of processing, here is a screenshot:

     

    RESYNC.png

     

    My intent is to wait until the resync is complete, restart the system, verify that the RAID is valid following that reboot, and then begin the restoration of data.

  • 10. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    SUMMARY:

     

    • Things are not going well, the SS4000-E failed again.
    • Possible defective unit (?)
    • With different drives, the NAS failed in much the same manner.
    • I have two more of these units, I have placed the sets of drives into them to see the outcome

     

     

     

    Here are the details:

     

    The resync of the drives had completed, and things looked good.

     

    The System log showed that the resync completed:

    New Log.png

     

    And, the home page showed the expected status:

    New Home.png

    And, looking at the drives, the NAS appeared fine, and the 4 drives showed Hotplug YELLOW as expected:

    New Yellow.png

     

    And then... For whatever reason, drive #4 went dark, and I had a Disk Change Notification:

    Wierd Drive Failure.png

    This time, I chose to pull Drive #4, and reinsert.  Disk Change Notification told me it was rebuilding...

    Wierd Drive Rebuilding.png

    However, after rebuilding for some time, Drive #4 went dark again.  Here is the System log:

     

    Failed yet again.bmp

    At this point, since the RAID was still valid on the three remaining drives, and I could access the various partitions, I started testing the Seagate drive (using "SeaTools") to check for any errors or issues.  Finding none, I planned on reinserting the drive.

     

    And that is when I found that Drive #1 was also dark.  At that point, that meant that only two drives of the RAID remained, and that was the end of that.

     

     

    At this point, I had found that the SAME box that had two different sets of drives from two different manufacturers had two drives go dark in the same slots:  #1 & #4.

     

    Now, if this takes the drives out of the calculation, that may leave the SS4000-E box itself as a possible problem.

     

     

    SO...

     

    Since I have two additional SS4000-E boxes, I decided to pop them out of the cardboard box, and insert the two sets of drives into those two boxes to see the outcome.

     

    They are both currently in the resync process, at around the 40% mark.

     

    Both boxes came with firmware 1.4 v.709...  So, I chose to let them run with that, and not upgrade the firmware to v710.

     

     

    Sidebar:

     

    One thing that caught my attention was that one of the boxes was MaxData branded.

     

    Same menus and function, but different color scheme.  For the curious, here's a few screenshots:

    Maxdata 1.4 709.png

     

    Maxdata Disk Resync.png

     

     

     

    I'll provide an update once the two boxes finish their resync.  I hope that I will be able to begin data restoration.

     

    v.

  • 11. Re: SS4000-E :  Questions on pushing the limits
    emilec Community Member

    If I could play devil's advocate here for a moment. If after all this struggling you do manage to get this working, would you actually trust it with your data? Personally I'd be cutting my losses and moving over to something like an HP MicroServer and NAS4Free (formerly FreeNAS).

     

    The time you've spent on trying to make this work has long overshadowed the initial cost savings IMO.

  • 12. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    emilec wrote:

     

    If I could play devil's advocate here for a moment. If after all this struggling you do manage to get this working, would you actually trust it with your data? Personally I'd be cutting my losses and moving over to something like an HP MicroServer and NAS4Free (formerly FreeNAS).

     

    The time you've spent on trying to make this work has long overshadowed the initial cost savings IMO.

     

    Thanks for the comment.  That's a fair question, and one that has crossed my mind.  I have built my own server based NAS before, however, I like the idea of a stand-alone device.

     

    At this point, I would say that I am committed to finding a solution that works.  If these two SS4000-E's should have any challenge once they complete resync, then it's off to another solution.  However, I will see this through and document my results to find if the 2TB drives work in these boxes or not.

  • 13. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    OPTIMISTIC UPDATE:

     

    The two "newer" SS4000-E's, with the two sets of 2TB drives (Hitachi & Seagate), appear to have completed their resync, and Hotplug Indicators show yellow.

     

     

    NAS-1:

    nas1 yellow.png

     

    NAS-4:

    nas4 yellow.png

     

    After rebooting each unit several times with no drive loss (which was the problem with the first tested SS4000-E), I am again beginning data restoration to NAS-1.

     

    Updates will be provided after each partition is completed.

  • 14. Re: SS4000-E :  Questions on pushing the limits
    zebraitis Community Member

    Minor Info Update:

     

    • NAS-1 continues to restore.

     

    • NAS-4 has been cycled several times with no negative effects, and today I have changed the name of that box from NAS-4 to NAS-5 with no negative effect.

     

    Screenshot of that name change:

    .name change.png

    .

    .

     

    ( Continued on next page...)
