21 Replies. Latest reply on Apr 23, 2012 9:33 AM by john_s@intel
      • 15. Re: SS4000-E :  Questions on pushing the limits

        Awaiting the results of your odyssey as I am very curious. I just started to investigate the capabilities & limits of the SS4000-E in anticipation of making a purchase of the device.

        • 16. Re: SS4000-E :  Questions on pushing the limits

          So...  Weird thing that happened today...



          Yesterday, I attempted to stress the solution by restoring two partitions worth of info concurrently.


          My intent was to write to NAS-1 "public" as well as NAS-1 "public-3" at the same time.


          Due to my own error, I actually wrote both restores to "public"  (... into separate subdirectories... so should have been no big deal).



          I stopped the restore of the data that should have gone to public-3 when I saw my mistake (a few hours later) and I proceeded to copy that data from "public" to "public-3"


          Data copy was  S L O W ... much slower than the initial network write.


          So, after about 30 minutes, I decided to delete that misplaced data... since I knew I had a backup anyway.



          Now, TODAY is when the REALLY strange behavior begins...


          On trying to access public-3 today, Win7 said that the network resource was not available, or that I had insufficient rights.



          Looking at the "home page", I saw that the NAS was ... confused.


          Apparently one partition (the smallest, public-3) was gone, another partition seemed to think that it was for backups, and the overall status was "NOT READY".


          1 Home is whacked.png


          This was troubling, as I had seen this type of behavior on the first box that I had attempted to test.


          At that point, I tried a reboot of the box, but on restart, the status was the same.



          Checking the partitions, I saw that "public-3" had decided that it was "0" MB in size:

          2 No Public 3 (1).png

          However, the disks in the RAID still were good... and hotplug indicator was YELLOW (which is good)

          3 Disk Raid Yellow.png


          Thinking that the NAS could be doing something in the background, I checked the System Status page... and the CPU was idle.

          4 System Staus Idle.png

          And... the system log showed no errors.


          So:  I'll give the S4000-E the benefit of the doubt on this one...



          I deleted partitions "public-2" and "public-3", recreated them and assigned rights, and continued with the restoration of the "public" partition.


          Now the home page shows what we expect to be normal:

          New Bitmap Image.png


          Bottom line:  Working, and still restoring, but that disappearing partition is more than a little disconcerting.

          • 17. Re: SS4000-E :  Questions on pushing the limits


            Informational Update:


            Completed restoring NAS-1/public with 1.88TB of data on a 1.99TB partition.





            All appears well.  Now beginning restore of NAS-1/public-2

            • 18. Re: SS4000-E :  Questions on pushing the limits

              OK...  Now it looks like we are getting to a potential problem that appears to be replicated.



              Reminder:  My SS4000-E has four 2TB drives installed in a RAID 5 config, resulting in 5.5 TB (5587 GB) of available space.  This requires the creation of three partitions:  public (created by default, and sized to 2048 GB by my choice), public-2 (created by choice at 2048 GB), and public-3 (created by choice with the remaining space).
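The capacity figures in that reminder can be checked with a little arithmetic. The sketch below is only an illustration and rests on two assumptions that are not stated in the post: that a "2TB" drive holds 2 x 10^12 bytes, and that the NAS reports "GB" in binary units (2^30 bytes). The function names are hypothetical.

```python
# Rough capacity arithmetic for a 4-drive RAID 5 split into 2048 GB partitions.
DRIVES = 4
DRIVE_BYTES = 2 * 10**12      # assumption: a "2TB" drive is 2 x 10^12 bytes
PARTITION_CAP = 2048          # partition size used in the post, in reported "GB"
GIB = 2**30                   # assumption: the NAS reports binary gigabytes


def raid5_usable_bytes(drives, drive_bytes):
    """RAID 5 spends one drive's worth of space on parity."""
    return (drives - 1) * drive_bytes


def split_partitions(total, cap):
    """Carve total space into partitions of at most `cap` each."""
    sizes = []
    remaining = total
    while remaining > 0:
        size = min(cap, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


usable = raid5_usable_bytes(DRIVES, DRIVE_BYTES) // GIB
partitions = split_partitions(usable, PARTITION_CAP)
print(usable)       # 5587
print(partitions)   # [2048, 2048, 1491]
```

Under those assumptions the numbers line up with the post: 5587 GB usable, and a third partition of roughly 1.5 TB once two 2048 GB partitions are taken out.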



              So far, NAS-1/public has restored well.


              I paused restoring NAS-1/public-2 and decided to let NAS-1/public-3 restore for a while.


              That is when the problem became evident.


              When I began to restore NAS-1/public-3, I saw that the speed of transfer was extremely slow.


              Here is what that looked like:


              Slow write to public-3.png


              The picture above shows a very slow transfer.  Compare that with the screenshot below of NAS-1/public-2, which shows the expected transfer speed...


              What speed should be like.png


              Also disturbing was finding that the system log no longer had a complete record, but appeared to start over:


              Hints about problem 1.png


              Knowing that the running restore of NAS-1/public-3 may take weeks at the speed displayed, I attempted to abort the restore, finally pulling the ethernet cable out to cause a loss of network resource.


              Having stopped the restore, I reconnected the ethernet cable to the NAS.   All partitions were still there, as well as all physical drives still indicating YELLOW in the RAID configuration.



              On reboot, the home screen showed a state that we have seen before:



              A partition that thinks that it is shared, a partition that thinks that it is a backup, and a partition that is GONE.


              (... and, yes, "public-3" again was at 0 bytes.)



              In this case, I was able to again delete NAS-1/public-3, and then public and public-2 "came back" and the system again was ready.


              The good news is that the physical drives remain in the RAID, and the RAID remains valid.



              Again, I continued with the previously interrupted restore of NAS-1/public-2, with apparently no problem.



              Here is the "new" home window, showing the space occupied by public and public-2:


              a new home.png




              • Why do I appear to have such a problem with the creation of "public-3", the partition that contains the remainder of the available drive space (approx. 1.5TB) ?


              • What affects that partition so that write speeds are so low?


              • Are there any suggestions of the partition size for that third partition?  Does that even matter?


              • Are there any limitations in the SS4000-E firmware that affect the creation of that partition?    Is there a problem in exceeding 4096 GB (2 x 2048) of total partitioned space?




              I await your input.


              While waiting, I continue on with the restoration of NAS-1/public-2



              • 19. Re: SS4000-E :  Questions on pushing the limits

                Sadly, the S4000-E has again suffered a critical error.



                Unless the support folks at Intel can provide a reason why, this may be the end of the pursuit of having the SS4000-E work with 2 TB drives.


                System XRAY output file available at http://dl.dropbox.com/u/23866842/Apr_22_xray.tgz





                NAS-1/public-2 was restoring, and continued to restore.


                However there was a Windows message informing me that NAS-1/public was no longer available as a resource.


                On checking windows network resources, I could access the data on NAS-1/public-2, but I could not access NAS-1/public.


                On checking the S4000-E, Drive #1 was dark.


                On logging in, there was a Disk Change Notification message, stating that Drive #1 was no longer active, and the raid was degraded (3 of 4 drives functional).


                Removing and reinserting the drive begins the rebuild process, however NAS-1/public remains unavailable.





                While I did not initiate any change, here is the Disk Change Notification:


                1 Disk Chage Message.png


                After removing the drive and reinserting, I received a rebuilding update:


                2 rebuilding.png

                If the time message is correct, then the rebuild process for that drive will take over 4 days.


                However, the rebuild process may not be of much value...  as after reinserting the drive there is still no access to NAS-1/public:


                3 No more public no scan no continue.png


                ("admin" and "public-2" can be accessed.)



                After reinserting the drive, selecting  [ Continue ]  on the Disk Change Notification screen would NOT allow me to proceed to the Home screen, so I could not tell if the "public" partition was still there.


                I was able to run the Intel XRAY diagnostic program built in on the S4000-E.  The output is available for anyone to view at:  http://dl.dropbox.com/u/23866842/Apr_22_xray.tgz


                Although I am unfamiliar with all the info that could be reviewed in this data, I checked the MESSAGES file and found the record of the disk being shut down by the SS4000-E.   This looks like the system failing and choosing to shut down the drive, rather than a physical drive failure:


                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.err kernel: drivers/scsi/gd31244/drv/gd31244_lld.c#2201:gd31244_device_reset: Dev Reset 0:0:0:0, dev# 0: Success
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.err kernel: drivers/scsi/gd31244/drv/gd31244_lld.c#2218:gd31244_device_reset: reconfigure device #0 failed
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.err kernel: drivers/scsi/gd31244/drv/gd31244_lld.c#2112:gd31244_bus_reset: Bus Reset called for Ho:Ch:Tgt:Lun (0:0:0:0)
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.err kernel: drivers/scsi/gd31244/drv/gd31244_lld.c#2201:gd31244_device_reset: Dev Reset 0:0:0:0, dev# 0: Success
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.err kernel: drivers/scsi/gd31244/drv/gd31244_lld.c#2218:gd31244_device_reset: reconfigure device #0 failed
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.info kernel: scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.warn kernel: SCSI error : <0 0 0 0> return code = 0x10000
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.warn kernel: end_request: I/O error, dev sda, sector 3907029008
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.err kernel: scsi0 (0:0): rejecting I/O to offline device
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.alert kernel: raid1: Disk failure on sda1, disabling device.
                Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.warn kernel: ^IOperation continuing on 3 devices
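One detail in the log excerpt above can be worked out directly: where on the disk the failed request landed. The sketch below pulls the device name and sector number out of the `end_request: I/O error` line and converts the sector to a byte offset. It assumes 512-byte logical sectors, which the log does not state, and the helper name `failing_offset` is hypothetical.

```python
import re

SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

# The I/O error line, copied from the MESSAGES excerpt above.
LOG_LINE = (
    "Apr 22 06:44:22 ZEBRAITIS-NAS-1 user.warn kernel: "
    "end_request: I/O error, dev sda, sector 3907029008"
)

IO_ERROR = re.compile(r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)")


def failing_offset(line, sector_bytes=SECTOR_BYTES):
    """Return (device, byte offset) for an I/O error line, or None if no match."""
    match = IO_ERROR.search(line)
    if match is None:
        return None
    return match.group("dev"), int(match.group("sector")) * sector_bytes


dev, offset = failing_offset(LOG_LINE)
print(dev, offset)                 # sda 2000398852096
print(round(offset / 10**12, 4))   # 2.0004
```

If the 512-byte assumption holds, the failed request sits at about 2.0004 x 10^12 bytes, which would put it very close to the end of a 2TB drive. That is only an observation about the offset; the log by itself does not say why the reset and reconfigure attempts failed.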



                Strictly out of curiosity, I will be letting the rebuild continue, just to see what will happen and to see the status of the "public" partition.




                Admittedly, I have spent nearly a month working on this, and considering these issues continue on more than one box...  Well, I'm nearly at the end.


                I would very much like the Intel support team to look at the XRAY output and let me know what's going on.

                • 20. Re: SS4000-E :  Questions on pushing the limits

                  RAID FAILURE


                  Turning off the SS4000-E and restarting resulted in the failure of the RAID.


                  Even though three drives remained, and the RAID and data should have been secure, it completely failed, requiring an initialization of the drives to continue in any manner.



                  At this point, there is no sense in making any other posts unless the Intel support team provides guidance.

                  • 21. Re: SS4000-E :  Questions on pushing the limits



                    I appreciate the detail you put into investigating this. However, we know that the 1.4 firmware for the SS4000 was created to add support for drives larger than 500 GB. The SS4000 was officially discontinued July 1, 2008. The last Tested Hardware and Operating System List was published February 2008 and contained one "officially" tested 1TB HDD. We don't know from a validation standpoint how drives that were not tested will function. If a customer chooses to use non-validated components, the operational testing becomes their responsibility. It looks like you've performed more than enough testing to determine that the 2TB HDDs you're using may not be reliable enough in this system.


                    Again, thanks for your effort. I wish the results had been more favorable.


