Hi,
After rebooting my SS4200-E, all the LEDs turned amber, and I can no longer access my data.
My NAS uses 4 disks in parity mode.
I saw many posts about this subject on this forum and tried several of the suggested solutions:
- did a hard reset -> unsuccessful
- bought a brand new disk and tried four times to swap it for one of the original disks -> unsuccessful
- connected over ssh, but "ps -af" didn't show any "e2fsck" process -> unsuccessful
When I ran "mdadm --examine sda1", and the same for sdb1, sdc1 and sdd1, I understood that I have at least 2 faulty disks. I think so, but I'm not sure (actually, I don't understand the output of the commands I used...).
My question is:
Is there a way to recover some data from my degraded array?
Of my 3 TB of data, only 70 GB is very important, and that is what I hope to recover.
I hope the array is not fully unreadable, is it?
Here is the output of "mdadm --examine":
# mdadm --examine sda1 sdb1 sdc1 sdd1
sda1:
Magic : a92b4efc
Version : 00.90.00
UUID : 7a5ebb7c:46fc5931:47ed1e69:69a35450
Creation Time : Tue Sep 23 11:32:57 2008
Raid Level : raid5
Device Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Sat Jun 2 12:51:50 2012
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 2
Spare Devices : 1
Checksum : eef3e5e6 - correct
Events : 0.52370
Layout : left-asymmetric
Chunk Size : 32K
Number Major Minor RaidDevice State
this 0 254 0 0 active sync /dev/evms/.nodes/sda1
0 0 254 0 0 active sync /dev/evms/.nodes/sda1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 254 3 3 active sync /dev/evms/.nodes/sdb1
4 4 254 1 4 spare /dev/evms/.nodes/sdc1
sdb1:
Magic : a92b4efc
Version : 00.90.00
UUID : 7a5ebb7c:46fc5931:47ed1e69:69a35450
Creation Time : Tue Sep 23 11:32:57 2008
Raid Level : raid5
Device Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Sat Jun 2 12:51:50 2012
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 2
Spare Devices : 1
Checksum : eef3e5e9 - correct
Events : 0.52370
Layout : left-asymmetric
Chunk Size : 32K
Number Major Minor RaidDevice State
this 4 254 1 4 spare /dev/evms/.nodes/sdc1
0 0 254 0 0 active sync /dev/evms/.nodes/sda1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 254 3 3 active sync /dev/evms/.nodes/sdb1
4 4 254 1 4 spare /dev/evms/.nodes/sdc1
sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 7a5ebb7c:46fc5931:47ed1e69:69a35450
Creation Time : Tue Sep 23 11:32:57 2008
Raid Level : raid5
Device Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Sat Jun 2 12:51:37 2012
State : active
Active Devices : 3
Working Devices : 4
Failed Devices : 1
Spare Devices : 1
Checksum : eef3193a - correct
Events : 0.52365
Layout : left-asymmetric
Chunk Size : 32K
Number Major Minor RaidDevice State
this 2 254 2 2 active sync /dev/evms/.nodes/sdd1
0 0 254 0 0 active sync /dev/evms/.nodes/sda1
1 1 0 0 1 faulty removed
2 2 254 2 2 active sync /dev/evms/.nodes/sdd1
3 3 254 3 3 active sync /dev/evms/.nodes/sdb1
4 4 254 1 4 spare /dev/evms/.nodes/sdc1
sdd1:
Magic : a92b4efc
Version : 00.90.00
UUID : 7a5ebb7c:46fc5931:47ed1e69:69a35450
Creation Time : Tue Sep 23 11:32:57 2008
Raid Level : raid5
Device Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Sat Jun 2 12:51:50 2012
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 2
Spare Devices : 1
Checksum : eef3e5ef - correct
Events : 0.52370
Layout : left-asymmetric
Chunk Size : 32K
Number Major Minor RaidDevice State
this 3 254 3 3 active sync /dev/evms/.nodes/sdb1
0 0 254 0 0 active sync /dev/evms/.nodes/sda1
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 254 3 3 active sync /dev/evms/.nodes/sdb1
4 4 254 1 4 spare /dev/evms/.nodes/sdc1
up
kloog,
Your mdadm readout shows State: clean, Active Devices: 2 and Failed Devices: 2. RAID 5 can tolerate only one failed disk, so with two failed disks the array is pretty much not recoverable.
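To make a readout like the one above easier to compare, here is a minimal sketch that condenses `mdadm --examine` text into one line per member (name, event count, state). The function name `summarize_examine` is made up for illustration, and the parsing assumes the version 0.90 superblock layout shown in this thread ("State :" and "Events :" lines under a "sdX1:" heading); other mdadm versions may print different labels.

```shell
#!/bin/sh
# Hypothetical helper: condense `mdadm --examine` text read from stdin.
# Assumes the 0.90 superblock layout posted in this thread.
summarize_examine() {
    awk '
        # A heading such as "sda1:" starts the block for one member.
        /^[^ ]+:$/                  { dev = substr($1, 1, length($1) - 1) }
        # "State : clean" -> remember the state for the current member.
        $1 == "State"  && $2 == ":" { state[dev] = $3 }
        # "Events : 0.52370" -> print member, event count and state.
        $1 == "Events" && $2 == ":" { print dev, $3, state[dev] }
    '
}
```

For example, `mdadm --examine sda1 sdb1 sdc1 sdd1 | summarize_examine` would print one line per member. A member whose event count lags behind the others stopped being updated earlier than the rest.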
You can see if you have access (if the subdirectories exist) at:
/mnt/soho_storage/samba/shares/
On every mount of the file system, the Storage System software does a file system check. This replays the ext3 journal, correcting any file system issues that may have been caused by an improper shutdown. If this fails, a full file system check is performed. The full check will make the file system mountable if possible. It may also recover some user data that would otherwise be lost and place it in the lost+found directory, which is at /mnt/soho_storage/lost+found. If the full file system check doesn't make the file system mountable, it is likely the user data is permanently lost. However, it may be worth attempting the file system check manually:
e2fsck -f /dev/evms/md0vol1
If this finishes successfully, attempt to mount the file system:
mount -t ext3 /dev/evms/md0vol1 /mnt/soho_storage
If this succeeds, the file system is now mounted. It's possible that some user files were lost in the file system corruption and recovered by the full file system check. They would be placed in /mnt/soho_storage/lost+found. If that is the case, the data can be made available to the user by copying it to the public folder:
cp -r /mnt/soho_storage/lost+found /mnt/soho_storage/samba/shares/public
The files in lost+found may not be complete, or may be corrupted, but the user will now have access to some or all of their data. The Storage System device should now be rebooted to restart all services.
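Before letting e2fsck change anything, it can be useful to do a read-only pass first. The following is only a sketch: the function name `readonly_check` is made up for illustration, and the default device path is simply the one quoted above. The `-n` option makes e2fsck open the file system read-only and answer "no" to every question, so nothing is written to the disks.

```shell
#!/bin/sh
# Hypothetical helper: read-only file system check before any repair.
# The default path is the volume named in this thread; pass another if needed.
readonly_check() {
    dev=${1:-/dev/evms/md0vol1}
    if [ ! -e "$dev" ]; then
        # The volume node does not exist, so there is nothing to check yet.
        echo "missing: $dev"
        return 2
    fi
    # -n: open read-only, answer "no" to all prompts
    # -f: check even if the file system is marked clean
    e2fsck -n -f "$dev"
}
```

If this reports that the device is missing, the array itself has not been assembled, and a file system check cannot help until it is.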
Be very careful when logged into the system as root; a mistake can make the system unusable.
If the above steps don't make the SS4200 data available, you'll need to recover data from your backup.
Regards,
John
Thanks for your answer John,
Unfortunately, I have nothing in the /mnt/soho_storage/ folder:
# cd /mnt/soho_storage
# ls -l
-r-xr-xr-x 1 root root 881 Jun 21 17:14 smb.conf
And it seems that I don't have md0vol1 either:
# e2fsck -f /dev/evms/md0vol1
e2fsck 1.38 (30-Jun-2005)
e2fsck: No such file or directory while trying to open /dev/evms/md0vol1
# pwd
/dev/evms
# ls -l
drwxr-xr-x 2 root root 60 Jun 21 17:13 dm
Is it dead?
Kloog,
It may be possible to recover the array if 3 of the 4 original drives are available and functioning.
With a RAID 5 array the following command can be used to attempt to recover the array:
mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
If that's not successful, use:
mdadm --examine /dev/sdX1 (where X is the specific disk you want to see: a, b, c or d) to determine which disk drive has failed. Run this command on all disks and check whether all working disks have an identical UUID.
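As a rough sketch of that UUID comparison, the function below reads `mdadm --examine` text for several members and prints the members whose UUID differs from the most common one. The name `odd_uuid_members` is made up for illustration, and the parsing assumes the "UUID :" label of the 0.90 superblock output shown earlier in this thread. With a tie (for example two against two) the choice of "most common" is arbitrary, so treat the result as a hint only.

```shell
#!/bin/sh
# Hypothetical helper: list members whose UUID is not the majority UUID.
# Reads `mdadm --examine` text (0.90 superblock layout) from stdin.
odd_uuid_members() {
    awk '
        # A heading such as "sda1:" starts the block for one member.
        /^[^ ]+:$/ { dev = substr($1, 1, length($1) - 1) }
        # "UUID : 7a5ebb7c:..." -> record it and count how often it occurs.
        $1 == "UUID" && $2 == ":" { uuid[dev] = $3; count[$3]++; order[++n] = dev }
        END {
            # Find the most common UUID, then print every member that differs.
            for (u in count) if (count[u] > best) { best = count[u]; major = u }
            for (i = 1; i <= n; i++) if (uuid[order[i]] != major) print order[i]
        }
    '
}
```

Empty output means every member reported the same UUID, which is the case for the readout posted at the top of this thread.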
In the case where 1 out of 4 disks has a different UUID and the others are the same, try the following command:
mdadm --assemble --run --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdd1 (this assumes sdc1 has the different UUID and sda1, sdb1 and sdd1 match; the command forces assembly of md0 without the disk whose UUID differs)
If this is successful, the array will begin recovery. Use cat /proc/mdstat to monitor recovery of the array. If it does not succeed because one or more of the drives is bad, the hardware is bad, or for any other reason, the user data is irretrievably lost. If it succeeds, the user data may still be recovered. Reboot the device. The file system, including the user data, is still on the array and, if not too badly corrupted, will be available. The Storage System device will find the file system and check it. If it can be mounted, it will be. If not, you may try a manual check as described in the previous section.
It's possible the file system check succeeded and created a lost+found area. If so, copy the data as described in the previous section.
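When watching /proc/mdstat during recovery, the part that matters most is the member map at the end of the "blocks" line, for example [UUUU] for a healthy four-disk array. Here is a small sketch that extracts it; the function name `md_health` is made up for illustration, and it assumes the usual mdstat layout in which "U" marks a working member and "_" a missing one.

```shell
#!/bin/sh
# Hypothetical helper: print array name, member map and missing count
# from /proc/mdstat text read on stdin.
md_health() {
    awk '
        # "md0 : active raid5 ..." names the array for the lines that follow.
        /^md[0-9]+ :/ { name = $1 }
        # The member map looks like [UUUU] or [U_UU]; "_" is a missing member.
        name != "" && match($0, /\[[U_]+\]/) {
            map = substr($0, RSTART + 1, RLENGTH - 2)
            missing = gsub(/_/, "_", map)   # counts "_" without changing map
            print name, map, "missing=" missing
            name = ""
        }
    '
}
```

Run it as `md_health < /proc/mdstat`. A RAID 5 array keeps running with one missing member, but once two are missing it can no longer serve data.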
Good luck,
John
Thank you John,
I tried it; the logs are below.
As I understand it, it didn't work, did it?
# mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: forcing event count in /dev/sdc1(2) from 52370 upto 52374
mdadm: clearing FAULTY flag for device 2 in /dev/md0 for /dev/sdc1
mdadm: /dev/md0 has been started with 3 drives (out of 4) and 1 spare.
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sda1[0] sdb1[4] sdd1[3] sdc1[2]
2930279808 blocks level 5, 32k chunk, algorithm 0 [4/3] [U_UU]
[=============>.......] recovery = 66.3% (648277552/976759936) finish=62.2min speed=87882K/sec
unused devices: <none>
After 2 hours...
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sda1[0] sdb1[4](S) sdd1[3] sdc1[5](F)
2930279808 blocks level 5, 32k chunk, algorithm 0 [4/2] [U__U]
Then I rebooted:
# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
unused devices: <none>
# pwd
/mnt/soho_storage
# ls
smb.conf
Unfortunately, it doesn't look like it, kloog. If things were correctly in place, the reboot would have mounted md0 through the normal process.
John
OK, thank you John,
I tried everything I could; now I'm ready to say goodbye to my data.
Bye
kloog,
I understand this information comes after the fact, but we cannot stress enough the importance of a backup solution for important data. Computer systems can and do fail. Data should be stored in multiple locations to increase the chance of recovering from a failure.
See our web solution on Planning for the worst case: the importance of a backup solution for more information.
Regards,
John

