The best way to recover a RAID-5 volume (VERITAS Volume Manager)

Posted 2007-10-09 11:39
1. First, make sure the hardware problems have been dealt with. Things to check include:
   - Data on the physical devices can be read with the dd command. For example, to read the first 100MB from device c1t11d0:

       # dd if=/dev/rdsk/c1t11d0s2 of=/dev/null bs=1024k count=100

   - Some disk arrays require the hardware write cache to be disabled; set that first.
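If several disks are suspect, the same read test can be scripted over all candidate devices; a minimal sketch (the /dev/rdsk/c1t*d0s2 pattern is just an assumption, adjust it to the actual devices in the disk group):

       # for d in /dev/rdsk/c1t*d0s2; do echo "checking $d"; dd if=$d of=/dev/null bs=1024k count=100 || echo "$d READ FAILED"; done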

2.  When a physical disk fails, the subdisks allocated on that disk are detached.  The vxprint -ht command shows those subdisks as "NDEV" (No Device) and the RAID-5 volume as operating in "Degraded Mode".  VxVM log messages appear in /var/adm/messages, for example:

   Nov  9 16:05:29 sydvcs4 unix: WARNING: vxvm:vxio: object disk1-01  detached from RAID-5 r5vol at column 0 offset 0
   Nov  9 16:05:29 sydvcs4 unix: WARNING: vxvm:vxio: RAID-5 r5vol entering degraded mode operation


It is very important to understand that for ANY write operation performed while the volume is in "Degraded Mode", the data destined for the detached subdisks is kept only in the RAID-5 parity.  When the physical disk is brought back into the system, that data has to be recovered from the parity onto the new subdisk.
3. The data on a detached subdisk, or the copy held in the parity, may therefore be the only up-to-date copy of the user data.  A common way RAID-5 data gets destroyed during recovery is to recalculate the parity from stale subdisks, or to regenerate a detached subdisk from stale parity, overwriting valid data with stale data.  This kind of corruption is IRREVERSIBLE.

4. The RAID-5 volume can be started with the delayrecover option, which makes Volume Manager postpone the recovery until the vxrecover command is run.  (The recovery can either be parity recalculation or regenerating the data from parity; both have the potential to overwrite valid data.)

       # vxvol -g <diskgroup> -o delayrecover start <volume>


The write activity on the subdisks can be watched with the vxstat command:

       # vxstat -i 5 -g <diskgroup> -s <subdisk>....

The subdisk information can be obtained by running vxprint -ht -g <diskgroup> <vol>.  The subdisk names can be listed with the following command:

       # vxprint -ht -g <diskgroup> <vol> | awk '/^sd/ {print $2}
/^s2/ {print $2}'
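The two commands above can be combined so that vxstat watches every data subdisk (the sd records) of the volume; a sketch:

       # vxstat -i 5 -g <diskgroup> -s `vxprint -ht -g <diskgroup> <vol> | awk '/^sd/ {print $2}'`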

5.  Do not force the RAID-5 volume to start.  When Volume Manager starts a volume that has detached subdisks (in the DET, FAIL or RCOV state), it will recalculate and rewrite the parity.  If two or more detached subdisks appear in the same row, Volume Manager does not know which of those subdisks still contain up-to-date data; in that case a normal (unforced) start will not begin the parity recalculation and the volume goes into the NEEDSYNC state.  If the volume is forced to start without the delayrecover option, the parity recalculation starts immediately, and if the parity holds the more up-to-date user data, the volume is corrupted.
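Before deciding how to start the volume, it helps to check the current plex and subdisk states, for example with vxprint and vxinfo (a sketch; the vxinfo output format differs between VxVM versions):

       # vxprint -g <diskgroup> -ht <volume>      # subdisk states: ENA, DET, NDEV, RCOV
       # vxinfo -g <diskgroup> <volume>           # overall volume condition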

6. The RAID-5 log is what guarantees consistency between the parity and the user data, and it should always be replayed.  If the log is not replayed, the consistency is not guaranteed and a parity verification (vxr5check) would be required.
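A sketch of the basic invocation of the parity verification utility mentioned above (run it on a started volume; it can take a long time on a large volume):

       # vxr5check -g <diskgroup> <volume>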

7. The VERITAS File System (VxFS) log is crucial for the recovery of an interrupted VxFS file system.  Without replaying the log, file system corruption will happen, especially if the file system was heavily used when the RAID-5 volume went down.  The catch is that the VxFS log needs to be replayed and WRITTEN to the RAID-5 volume before the full file system check can proceed, but without knowing the integrity of the RAID-5 volume, any write operation may cause corruption to volume data.  To solve this dilemma, metasave can be used to get an image of the file system, and fsck can be run against the saved image before running the actual fsck on the RAID-5 volume.  The metasave utility can be obtained from the VERITAS ftp site: ftp://ftp.veritas.com/pub/support/metasave.tar.Z.  metasave is OS version dependent; please choose the right version for your OS.  For example, on Solaris 8 the following commands copy the file system structure to an ordinary file and run fsck on that ordinary file.  Remember to replay the log; DO NOT use the "nolog" option.

       # metasave_5.8 /dev/vx/rdsk/<diskgroup>/<volume> | metasave_5.8 -r <VxFS image file>
       # fsck -F vxfs -o full <VxFS image file>




Recovery with no more than one failed subdisk in each row
---------------------------------------------------------------------------------
In any row of the RAID-5 volume, if only one subdisk is in the "NDEV", "FAIL", "DET" or "RCOV" state, the data of that subdisk can be recovered from the parity.  For example, a RAID-5 volume in the following state:
v  r5vol        -            ENABLED  ACTIVE   24576000 raid5     -        RAID
pl r5vol-01     r5vol        ENABLED  ACTIVE   24577920 RAID      5/32     RW
sd disk1-01     r5vol-01     disk1    0        2049120  0/0       -        NDEV         <<<< 1st ROW with OFFSET 0
sd disk10-01    r5vol-01     disk10   415      2047680  0/2049120 sdc1t1d0s0 ENA     <<<< 2nd ROW with OFFSET 2049120
sd disk15-01    r5vol-01     disk15   415      2047680  0/4096800 sdc1t1d0s5 ENA     <<<< 3rd ROW with OFFSET 4096800

sd disk3-01     r5vol-01     disk3    0        2049120  1/0       c1t3d0   ENA       <<<< 1st ROW with OFFSET 0
sd disk30-01    r5vol-01     disk30   415      2047680  1/2049120 -        NDEV     <<<< 2nd ROW with OFFSET 2049120
sd disk35-01    r5vol-01     disk35   415      2047680  1/4096800 sdc1t3d0s5 ENA  <<<< 3rd ROW with OFFSET 4096800

sd disk4-01     r5vol-01     disk4    0        2049120  2/0       c1t4d0   ENA        <<<< 1st ROW
sd disk40-01    r5vol-01     disk40   415      2047680  2/2049120 sdc1t4d0s0 ENA
sd disk45-01    r5vol-01     disk45   415      2047680  2/4096800 -        NDEV

sd disk5-01     r5vol-01     disk5    0        2049120  3/0       c1t5d0   ENA        <<<< 1st ROW
sd disk50-01    r5vol-01     disk50   415      2047680  3/2049120 sdc1t5d0s0 ENA
sd disk55-01    r5vol-01     disk55   415      2047680  3/4096800 sdc1t5d0s5 ENA

sd disk6-01     r5vol-01     disk6    0        2049120  4/0       c1t6d0   ENA        <<<< 1st ROW
sd disk60-01    r5vol-01     disk60   415      2047680  4/2049120 sdc1t6d0s0 ENA
sd disk65-01    r5vol-01     disk65   415      2047680  4/4096800 sdc1t6d0s5 ENA

pl r5vol-02     r5vol        ENABLED  LOG      5760     CONCAT    -        RW
sd disk56-01    r5vol-02     disk56   415      5760     0         sdc1t5d0s6 ENA

Of disk1-01, disk3-01, disk4-01, disk5-01 and disk6-01 in the first row, only disk1-01 has failed.  disk30-01 is the only failed subdisk in the second row, and disk45-01 is the only failed subdisk in the third row.  Recovery is simply a matter of reattaching or replacing the failed disks and then running vxrecover.

If the volume is stopped, it needs to be started.  Start the volume with the delayrecover option:

       # vxvol -g <diskgroup> -o delayrecover start <volume>

The end-user should check the data integrity without writing any data to the volume.  In the case of the VERITAS File System (VxFS), write operations are required because the intent log needs to be replayed.  If you want to be completely safe, get the VxFS image with metasave and run fsck against the image.

If you are confident that the volume data is in good shape, just run a simple fsck on the file system:

       # fsck -F vxfs /dev/vx/rdsk/<diskgroup>/<volume>

If a full file system check is required even AFTER THE LOG IS REPLAYED, run a full fsck with option "-n" to assess the severity of damage:

    # fsck -F vxfs -o full -n /dev/vx/rdsk/<diskgroup>/<volume>


On the other hand, if you want to be sure that the volume is recovered properly before replaying the VxFS intent log, you can save an image of the file system to an ordinary file and run a full fsck on that ordinary file.  Please note that a log replay is essential to the recovery of VxFS; the nolog option should not be used unless the intent log is corrupted.

       # metasave_5.8 /dev/vx/rdsk/<diskgroup>/<volume> | metasave_5.8 -r <VxFS image file>
       # fsck -F vxfs -o full <VxFS image file>

If the VxFS image checks out okay, you can then run fsck on the actual volume.

After the data has been verified, recover the detached subdisks with vxreattach, or use vxdiskadm to replace the failed disks.  If a subdisk is in the "RCOV" state, vxrecover can recover it.
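For example, if the failed disk simply came back (say, after a cable or power problem was fixed), a typical sequence might look like this sketch; c1t11d0 is a placeholder device name:

       # vxreattach c1t11d0                       # reattach the disk media record to its device
       # vxrecover -g <diskgroup> -s <volume>     # start the volume (if stopped) and recover the stale subdisks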
Recovery with two or more failed subdisks in a row
------------------------------------------------------------------------------------------------------------------------------------------------
For example:

v  r5vol        -            DETACHED ACTIVE   24576000 raid5     -        RAID
pl r5vol-01     r5vol        ENABLED  ACTIVE   24577920 RAID      5/32     RW
sd disk1-01     r5vol-01     disk1    0        2049120  0/0       -        NDEV
sd disk10-01    r5vol-01     disk10   415      2047680  0/2049120 sdc1t1d0s0 ENA
sd disk15-01    r5vol-01     disk15   415      2047680  0/4096800 sdc1t1d0s5 ENA
sd disk3-01     r5vol-01     disk3    0        2049120  1/0       c1t3d0   ENA
sd disk30-01    r5vol-01     disk30   415      2047680  1/2049120 -        NDEV
sd disk35-01    r5vol-01     disk35   415      2047680  1/4096800 sdc1t3d0s5 ENA
sd disk4-01     r5vol-01     disk4    0        2049120  2/0       c1t4d0   ENA
sd disk40-01    r5vol-01     disk40   415      2047680  2/2049120 sdc1t4d0s0 ENA
sd disk45-01    r5vol-01     disk45   415      2047680  2/4096800 -        NDEV
sd disk5-01     r5vol-01     disk5    0        2049120  3/0       c1t5d0   ENA
sd disk50-01    r5vol-01     disk50   415      2047680  3/2049120 sdc1t5d0s0 ENA
sd disk55-01    r5vol-01     disk55   415      2047680  3/4096800 sdc1t5d0s5 ENA
sd disk6-01     r5vol-01     disk6    0        2049120  4/0       -        NDEV
sd disk60-01    r5vol-01     disk60   415      2047680  4/2049120 -        NDEV
sd disk65-01    r5vol-01     disk65   415      2047680  4/4096800 -        NDEV
pl r5vol-02     r5vol        ENABLED  LOG      5760     CONCAT    -        RW
sd disk56-01    r5vol-02     disk56   415      5760     0         sdc1t5d0s6 ENA

The following messages were logged in /var/adm/messages:
Nov  9 16:05:29 sydvcs4 unix: WARNING: vxvm:vxio: object disk1-01 detached from RAID-5 r5vol at column 0 offset 0
Nov  9 16:05:29 sydvcs4 unix: WARNING: vxvm:vxio: RAID-5 r5vol entering degraded mode operation

Nov  9 16:07:30 sydvcs4 unix: WARNING: vxvm:vxio: object disk30-01 detached from RAID-5 r5vol at column 1 offset 2049120
Nov  9 16:07:30 sydvcs4 unix: WARNING: vxvm:vxio: RAID-5 r5vol entering degraded mode operation

Nov  9 16:07:30 sydvcs4 unix: WARNING: vxvm:vxio: object disk45-01 detached from RAID-5 r5vol at column 1 offset 2049120
Nov  9 16:07:30 sydvcs4 unix: WARNING: vxvm:vxio: RAID-5 r5vol entering degraded mode operation

(24 minutes later)

Nov  9 16:31:29 sydvcs4 unix: WARNING: vxvm:vxio: Double failure condition detected on RAID-5 r5vol
Nov  9 16:31:29 sydvcs4 unix: WARNING: vxvm:vxio: detaching RAID-5 r5vol

Another place to find out the failure sequence is the emails sent by the vxnotify process.  By default the mails go to the root mailbox.  Look for emails like the following, which were sent to root because of the disk failures:


-------- Begin of root mails ----------

Message 63:
From root Sat Nov  9 16:05:45 2002
Date: Sat, 9 Nov 2002 16:05:45 +1100 (EST)
From: Super-User <root>
To: root
Subject: Volume Manager failures on host sydvcs4

Failures have been detected by the VERITAS Volume Manager:

failed disks:
disk1

failing disks:
disk1

Message 65:
From root Sat Nov  9 16:07:47 2002
Date: Sat, 9 Nov 2002 16:07:47 +1100 (EST)
From: Super-User <root>
To: root
Subject: Volume Manager failures on host sydvcs4

Failures have been detected by the VERITAS Volume Manager:

failed disks:
disk1
disk30

failing disks:
disk1
disk30


Message 68:
From root Sat Nov  9 16:13:27 2002
Date: Sat, 9 Nov 2002 16:13:27 +1100 (EST)
From: Super-User <root>
To: root
Subject: Volume Manager failures on host sydvcs4

Failures have been detected by the VERITAS Volume Manager:

failed disks:
disk1
disk30
disk45

failing disks:
disk1
disk30


(24 minutes later)

Message  4:
From root Sat Nov  9 16:37:04 2002
Date: Sat, 9 Nov 2002 16:37:04 +1100 (EST)
From: Super-User <root>
To: root
Subject: Volume Manager failures on host sydvcs4

Failures have been detected by the VERITAS Volume Manager:

failed disks:
disk1
disk30
disk45
disk60
disk65

failing disks:
disk1
disk30

.......

---------- End of root emails --------

From the above emails, it can be seen that disk1, disk30 and disk45 failed before disk6, disk60 and disk65.  
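If the mails are no longer available, failure events can also be watched live with vxnotify; the options below (-f for failure events, -w for the wait interval) are assumed here, so check the vxnotify manual page for your VxVM version:

       # vxnotify -f -w 15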

According to the /var/adm/messages logs and the vxnotify emails, 24 minutes elapsed from the point the volume entered "degraded mode" to the point the volume was disabled (detached) due to the double failure.  It can be safely assumed that subdisks disk1-01, disk30-01 and disk45-01 were obsoleted by the write operations which occurred during those 24 minutes.  The recovery procedure is as follows.

1. Save a copy of the disk group configuration:

       # vxdisk -o alldgs list
       # vxprint -rht
       # vxprint -Am
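
It is safer to redirect these outputs to files that live outside the disk group, so they survive the recovery; the file names below are only a suggestion:

       # vxdisk -o alldgs list > /var/tmp/<diskgroup>.vxdisk.out
       # vxprint -rht          > /var/tmp/<diskgroup>.vxprint-rht.out
       # vxprint -Am           > /var/tmp/<diskgroup>.vxprint-Am.out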

2.  Dissociate the obsolete subdisks (make sure you have saved the original configuration, so that it can be used during the recovery):

       # vxsd -g <diskgroup> dis <subdisk>.....
   For example, based on the messages from /var/adm/messages:
       # vxsd -g <diskgroup> dis disk1-01 disk30-01 disk45-01

3. Unstale the subdisks that contain the valid data:

       # vxmend -g <diskgroup> fix unstale <subdisk>...
   For the above example:
       # vxmend -g <diskgroup> fix unstale disk6-01 disk60-01 disk65-01

4. Start the volume with the delayrecover option:

      # vxvol -g <diskgroup> -o delayrecover start <volume>

5. Ask the end-user to verify the data.
6. If the data is verified by the end-user, the obsolete subdisks can be associated back.  You will need the original configuration (vxprint -ht) in order to obtain the correct position to put the subdisks back.  The <column>/<offset> values can be found in the 7th column of the vxprint -ht output.

       # vxsd -g <diskgroup> -l <column>/<offset> assoc <plex> <subdisk>
   For the above example, associate back disk1-01, disk30-01 and disk45-01 according to the saved vxprint -ht output.

       # vxsd -g <diskgroup> -l 0/0 assoc r5vol-01 disk1-01
       # vxsd -g <diskgroup> -l 1/2049120 assoc r5vol-01 disk30-01
       # vxsd -g <diskgroup> -l 2/4096800 assoc r5vol-01 disk45-01

  The association of a subdisk should automatically start the recovery of data to the subdisks from the parity.  Check the tasks with the vxtask list command.
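For example, to watch the recovery tasks and their progress:

       # vxtask list
       # vxtask -l list                           # long format, shows progress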


Recovery with two or more failed subdisks in each row, when the sequence of failure cannot be determined
--------------------------------------------------------------------------------------------------------------------------------------------------
If for whatever reason the sequence of failure of subdisks can not be determined, the RAID-5 needs to be recovered by trial and error.  Please keep a detailed log on what you have done because it will be a very tedious task.
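A simple way to keep that log is to append each attempt, together with a fresh vxprint snapshot, to a file; a sketch (the file name and wording are arbitrary):

       # echo "`date` : attempt 1 - dissociated <subdisks>, unstaled <subdisks>" >> /var/tmp/r5vol-recovery.log
       # vxprint -g <diskgroup> -ht >> /var/tmp/r5vol-recovery.log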

It is important to remember that once a subdisk gets overwritten, the data is gone forever.  The same is true for the parity: once the parity is recalculated, the data stored there during degraded mode is gone forever.  Never force the volume to start with the "-f" or "-o force" option and always start the volume in the delayrecover mode.

Always be careful not to enable (unstale) all the subdisks in the same row.  If all the subdisks in the same row are enabled manually, the parity may become inconsistent with the data, which will lead to data corruption later on.

It is NOT correct simply to set the plex state to empty with vxmend and start the volume, because the parity would then be recalculated and whatever data was stored in the parity during degraded mode would be wiped out.

The procedure to recover the volume in this kind of situation is listed below.

1. Save a copy of the disk group configuration:

       # vxdisk -o alldgs list
       # vxprint -rht
       # vxprint -Am

2. Try to find out as much as possible from /var/adm/messages and the vxnotify emails about which subdisks are obsolete, then dissociate them from the plex.

       # vxsd -g <diskgroup> dis <subdisk>

   Take the following volume as an example. If it is known that disk1-01 is obsolete, disk1-01 should be dissociated. Make sure you have the vxprint -ht output before detaching the subdisk, because later the subdisk has to be put back to the original location.

       # vxsd -g <diskgroup> dis disk1-01

and unstale the disk6-01 which is in the same row:

       # vxmend -g <diskgroup> fix unstale disk6-01

v  r5vol        -            DETACHED ACTIVE   24576000 raid5     -        RAID
pl r5vol-01     r5vol        ENABLED  ACTIVE   24577920 RAID      5/32     RW
sd disk1-01     r5vol-01     disk1    0        2049120  0/0       -        NDEV    <<<< first row
sd disk10-01    r5vol-01     disk10   415      2047680  0/2049120 sdc1t1d0s0 ENA
sd disk15-01    r5vol-01     disk15   415      2047680  0/4096800 sdc1t1d0s5 ENA
sd disk3-01     r5vol-01     disk3    0        2049120  1/0       c1t3d0   ENA
sd disk30-01    r5vol-01     disk30   415      2047680  1/2049120 -        NDEV   <<< second row
sd disk35-01    r5vol-01     disk35   415      2047680  1/4096800 sdc1t3d0s5 ENA
sd disk4-01     r5vol-01     disk4    0        2049120  2/0       c1t4d0   ENA
sd disk40-01    r5vol-01     disk40   415      2047680  2/2049120 sdc1t4d0s0 ENA
sd disk45-01    r5vol-01     disk45   415      2047680  2/4096800 -        NDEV   <<< third row
sd disk5-01     r5vol-01     disk5    0        2049120  3/0       c1t5d0   ENA
sd disk50-01    r5vol-01     disk50   415      2047680  3/2049120 sdc1t5d0s0 ENA
sd disk55-01    r5vol-01     disk55   415      2047680  3/4096800 sdc1t5d0s5 NDEV  <<< third row
sd disk6-01     r5vol-01     disk6    0        2049120  4/0       -        NDEV   <<< first row
sd disk60-01    r5vol-01     disk60   415      2047680  4/2049120 -        NDEV    <<< second row
sd disk65-01    r5vol-01     disk65   415      2047680  4/4096800 -        NDEV    <<< third row
pl r5vol-02     r5vol        ENABLED  LOG      5760     CONCAT    -        RW
sd disk56-01    r5vol-02     disk56   415      5760     0         sdc1t5d0s6 ENA


3. For the remaining two rows, dissociate one subdisk from each row at a time, start the volume with the delayrecover option and verify the data.  This has to be done as many times as required until all the valid subdisks have been located.

If there are N failed subdisks in first row, M failed subdisks in second row and P failed subdisks in third row, it may be necessary to try M x N x P times before all the valid data can be found.

For the above example, after taking out disk1-01 from the first row, it is clear that disk6-01 has the valid data.  For the second row, disk30-01 and disk60-01 should be taken out, and for the third row, disk45-01, disk55-01 and disk65-01.  Disks in all the following 6 combinations should be taken out.
     disk30-01    disk45-01
     disk30-01    disk55-01
     disk30-01    disk65-01
     disk60-01    disk45-01
     disk60-01    disk55-01
     disk60-01    disk65-01

For the first attempt, dissociate disk30-01 and disk45-01:

       # vxsd -g <diskgroup> dis disk30-01 disk45-01

Then enable (unstale) the other subdisks and start the volume with the delayrecover option:

       # vxmend -g <diskgroup> fix unstale disk60-01 disk55-01 disk65-01
       # vxvol -o delayrecover -g <diskgroup> start <volume>

Then verify data using methods that will not write any data to the volume.  In the case of VxFS, use metasave to take an image and run fsck on the image.

       # metasave_5.8 /dev/vx/rdsk/<diskgroup>/<volume> | metasave_5.8 -r <VxFS image file>
       # fsck -F vxfs -o full <VxFS image file>

If fsck returns without error, then fsck can be run on the actual volume.

If fsck returns lots of errors, it means there are still obsolete subdisks: the next set of subdisks needs to be taken out and the previously dissociated subdisks put back.  It is safer to dissociate the next set of subdisks before associating the original set, because if the volume is started and a subdisk is associated, Volume Manager will start the recovery right away and overwrite the newly associated subdisk.

VERY IMPORTANT: Remember not to associate any subdisk when the volume is started.  By associating a subdisk when the volume is started, Volume Manager will recover the data to the subdisk and overwrite the subdisk.  Always stop the volume before making any changes.

       # vxvol -g <diskgroup> stop <volume>
       # vxsd -g <diskgroup> dis <next combination of subdisks>
       # vxsd -g <diskgroup> -l <column>/<offset> assoc <plex> <subdisk>
       # vxmend -g <diskgroup> fix unstale <subdisk>

For the above example, disk30-01 and disk55-01 are the next combination to be dissociated:

       # vxvol -g <diskgroup> stop r5vol
       # vxsd -g <diskgroup> dis disk55-01       # disk30-01 has already been dissociated
       # vxsd -g <diskgroup> -l 2/4096800 assoc r5vol-01 disk45-01
       # vxmend -g <diskgroup> fix unstale disk45-01

Then start the volume again and ask the end-user to verify the data.  In the case of VxFS, get the VxFS image and fsck the VxFS image:

       # vxvol -g <diskgroup> -o delayrecover start <volume>
       # metasave_5.8 /dev/vx/rdsk/<diskgroup>/<volume> | metasave_5.8 -r <VxFS image file>
       # fsck -F vxfs -o full <VxFS image file>

Repeat the above procedure for all the combinations of subdisks until the end-user verifies the data to be correct.

Once the end-user verifies the data to be correct, the rest of the subdisks can be recovered by associating them back after the volume is started:

       # vxsd -g <diskgroup> -l <column>/<offset> assoc <plex> <subdisk>


Raid5 logs failed while being zeroed / Plex contains unusable subdisk
--------------------------------------------------------------------------------------------------------
If the device for the log plex is not accessible, the RAID-5 volume can not be started even if it is forced to.

v  r5vol        -            DISABLED CLEAN    24576000 raid5     -        RAID
pl r5vol-01     r5vol        DISABLED ACTIVE   24577920 RAID      5/32     RW
sd disk1-01     r5vol-01     disk1    0        2049120  0/0       c1t11d0  RCOV
sd disk10-01    r5vol-01     disk10   415      2047680  0/2049120 -        NDEV
sd disk15-01    r5vol-01     disk15   415      2047680  0/4096800 sdc1t11d0s5 RCOV
sd disk3-01     r5vol-01     disk3    0        2049120  1/0       c1t13d0  ENA
sd disk30-01    r5vol-01     disk30   415      2047680  1/2049120 sdc1t13d0s0 ENA
sd disk35-01    r5vol-01     disk35   415      2047680  1/4096800 sdc1t13d0s5 ENA
sd disk4-01     r5vol-01     disk4    0        2049120  2/0       c1t14d0  ENA
sd disk40-01    r5vol-01     disk40   415      2047680  2/2049120 sdc1t14d0s0 ENA
sd disk45-01    r5vol-01     disk45   415      2047680  2/4096800 sdc1t14d0s5 ENA
sd disk5-01     r5vol-01     disk5    0        2049120  3/0       c1t15d0  ENA
sd disk50-01    r5vol-01     disk50   415      2047680  3/2049120 sdc1t15d0s0 ENA
sd disk55-01    r5vol-01     disk55   415      2047680  3/4096800 sdc1t15d0s5 ENA
sd disk6-01     r5vol-01     disk6    0        2049120  4/0       c1t6d0   ENA
sd disk60-01    r5vol-01     disk60   415      2047680  4/2049120 sdc1t6d0s0 ENA
sd disk65-01    r5vol-01     disk65   415      2047680  4/4096800 sdc1t6d0s5 ENA
pl r5vol-02     r5vol        DISABLED BADLOG   5760     CONCAT    -        WO
sd disk56-01    r5vol-02     disk56   415      5760     0         sdc1t15d0s6 ENA         <<<< Device sdc1t15d0s6 is not accessible

You can not even force this volume to start if the LOG device fails:

sydvcs4# vxvol -f -o delayrecover start r5vol           
vxvm:vxvol: ERROR: Volume r5vol: Failed to zero logs offset 0 len 248
vxvm:vxvol: ERROR: No such file or directory
vxvm:vxvol: ERROR: Volume r5vol: Raid5 log(s) failed while being zeroed . The logs should be replaced before starting the volume

Another example is a subdisk in the log plex that is not mapped to a physical device:

v  r5vol        -            DISABLED ACTIVE   2457600  raid5     -        RAID
pl r5vol-01     r5vol        DISABLED ACTIVE   2459520  RAID      5/32     RW
sd disk6-01     r5vol-01     disk6    0        204960   0/0       c1t6d0   FAIL
sd disk60-01    r5vol-01     disk60   927      204960   0/204960  sdc1t6d0s0 FAIL
sd disk65-01    r5vol-01     disk65   927      204960   0/409920  sdc1t6d0s5 FAIL
sd disk1-01     r5vol-01     disk1    0        204960   1/0       c1t11d0  ENA
sd disk10-01    r5vol-01     disk10   927      204960   1/204960  sdc1t11d0s0 ENA
sd disk15-01    r5vol-01     disk15   927      204960   1/409920  sdc1t11d0s5 ENA
sd disk3-01     r5vol-01     disk3    0        204960   2/0       c1t13d0  ENA
sd disk30-01    r5vol-01     disk30   927      204960   2/204960  sdc1t13d0s0 ENA
sd disk35-01    r5vol-01     disk35   927      204960   2/409920  sdc1t13d0s5 ENA
sd disk4-01     r5vol-01     disk4    0        204960   3/0       c1t14d0  ENA
sd disk40-01    r5vol-01     disk40   927      204960   3/204960  sdc1t14d0s0 ENA
sd disk45-01    r5vol-01     disk45   927      204960   3/409920  sdc1t14d0s5 ENA
sd disk5-01     r5vol-01     disk5    0        204960   4/0       c1t15d0  ENA
sd disk50-01    r5vol-01     disk50   927      204960   4/204960  sdc1t15d0s0 RCOV
sd disk55-01    r5vol-01     disk55   927      204960   4/409920  sdc1t15d0s5 ENA
pl r5vol-02     r5vol        DISABLED NODEVICE 5856     CONCAT    -        RW        <<< Plex is marked as NODEVICE
sd disk56-01    r5vol-02     disk56   927      5856     0         -        NDEV             <<< Subdisk is not mapped into a device

# vxvol -f -o delayrecover start r5vol
vxvm:vxvol: ERROR: changing plex r5vol-02: Plex contains unusable subdisk

Try to recover the device before starting the volume. The log needs to be zeroed out before the volume can be started.

If you need to access the data urgently and you cannot fix the log disk, you will need to dissociate the log plex and force the volume to start, as sketched below.
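A sketch of that last-resort sequence for the example volume above (r5vol-02 is the failed log plex); remember that a forced start can overwrite valid data, so only do this after the precautions described earlier have been taken:

       # vxplex -g <diskgroup> dis r5vol-02
       # vxvol -g <diskgroup> -f -o delayrecover start r5vol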

(I have translated most of the article; the untranslated parts follow essentially the same principles, so I did not repeat them.  If you want to read the original English text, it is at http://support.veritas.com/docs/251793.  There are many problems that this article does not cover, so I hope people who have handled such cases will add to it, to build up a knowledge base.  Thanks in advance to everyone who helps.)

Reply #2 (2007-10-09 13:29): Bookmarking this right away.

Reply #3 (yuhuohu, 2007-10-09 14:31): Mod Feng is out scooping up highlighted posts again...

Reply #4 (风之幻想, 2007-10-09 14:55), replying to yuhuohu: I had some spare time these past two days, so I translated and wrote up a few things.  Brother 火狐 has written a lot too, heh.

Reply #5 (robina, 2007-10-09 16:46), replying to 风之幻想: Mod Feng, could you also write up the best way to recover RAID5 under SVM?

Reply #6 (2007-10-09 17:00), replying to robina: I am still a newbie with Solaris Volume Manager myself; I only know the basics.

Reply #7 (2007-10-10 09:07): More learning never hurts, heh.

Reply #8 (东方蜘蛛, 2007-10-10 10:35), replying to robina: Just one command.

Reply #9 (robina, 2007-10-10 14:36), replying to 东方蜘蛛: What command, master? Do I need to put up a bounty?

Reply #10 (2007-10-10 16:54), replying to robina: metareplace