8.1 Recovery from a multiple disk failure
The scenario is:
A controller dies and takes two disks offline at the same time,
All disks on one SCSI bus can no longer be reached if a disk dies,
A cable comes loose...
In short: quite often you get a temporary failure of several disks at once; afterwards the RAID superblocks are out of sync and you can no longer init your RAID array.
If using mdadm, you could first try to run:
mdadm --assemble --force
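As a rough sketch of what the full command line might look like: the array name /dev/md0 and the member partitions below are assumptions for illustration only, not taken from the text above, so substitute your own. The wrapper only prints the command unless RUN=1 is set, so you can check it before it touches any disk.

```shell
# Hypothetical array and member names; replace with your own devices.
MD_DEVICE=/dev/md0
MEMBERS="/dev/sda1 /dev/sdb1 /dev/sdc1"

# Print the command by default; only execute it when RUN=1 is set.
assemble_forced() {
    if [ "${RUN:-0}" = "1" ]; then
        # MEMBERS is left unquoted on purpose so it splits into separate arguments.
        mdadm --assemble --force "$MD_DEVICE" $MEMBERS
    else
        echo "mdadm --assemble --force $MD_DEVICE $MEMBERS"
    fi
}

assemble_forced
```

Running it with RUN=1 (as root) would attempt the actual forced assembly.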
If that does not work, there is one thing left: rewrite the RAID superblocks with mkraid --force
To get this to work, you'll need an up-to-date /etc/raidtab - if it doesn't EXACTLY match the devices and ordering of the original disks, this will not work as expected and will most likely obliterate whatever data you used to have on your disks.
Look at the syslog produced by trying to start the array; you'll see the event count for each superblock. Usually it's best to leave out the disk with the lowest event count, i.e. the oldest one.
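Picking the disk to leave out can be sketched as a small helper. It assumes you have already collected "device event_count" pairs by hand, for example from the syslog lines or from the Events field that mdadm --examine reports; the function name and the sample numbers are made up for illustration.

```shell
# Read "device event_count" pairs (one per line, separated by a single space)
# on stdin and print the device with the lowest count, i.e. the most stale superblock.
stalest_member() {
    sort -k2,2n | head -n 1 | cut -d' ' -f1
}

# Sample input with invented event counts.
printf '%s\n' '/dev/sda1 1042' '/dev/sdb1 1042' '/dev/sdc1 1038' | stalest_member
```

With the sample input, /dev/sdc1 has the lowest count and would be the candidate to mark as failed.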
If you mkraid without failed-disk, the recovery thread will kick in immediately and start rebuilding the parity blocks - not necessarily what you want at that moment.
With failed-disk you can specify exactly which disks you want to be active and perhaps try different combinations for best results. BTW, only mount the filesystem read-only while trying this out... This has been successfully used by at least two guys I've been in contact with.
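For illustration, a raidtab using failed-disk might look roughly like the fragment below. Everything in it (a three-disk RAID-5, the device names, the chunk size, and /dev/sdc1 being the stale disk) is an assumption; the real file has to match your original array exactly, as warned above.

```
# Hypothetical /etc/raidtab: three-disk RAID-5 with the stalest disk marked failed
raiddev /dev/md0
    raid-level              5
    nr-raid-disks           3
    nr-spare-disks          0
    persistent-superblock   1
    parity-algorithm        left-symmetric
    chunk-size              32
    device                  /dev/sda1
    raid-disk               0
    device                  /dev/sdb1
    raid-disk               1
    # assumed to have the lowest event count, so it is left out of the array
    device                  /dev/sdc1
    failed-disk             2
```

To try a different combination, swap which device carries failed-disk while keeping the device order and slot numbers unchanged.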
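I keep thinking that rewriting the superblock is one way to do it, but I don't know how to actually carry it out. Does everyone think this approach would work?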