kwtip posted on 2014-05-20 11:32

SVM volume reports Last Erred and I can't repair it, urgent

Volume status
d30: Mirror
    Submirror 0: d31
      State: Needs maintenance
    Submirror 1: d32
      State: Needs maintenance
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 518700000 blocks (247 GB)

d31: Submirror of d30
    State: Needs maintenance
    Invoke: after replacing "Maintenance" components:
                metareplace d30 c0t1d0s0 <new device>
    Size: 518700000 blocks (247 GB)
    Stripe 0:
      Device     Start Block  Dbase        State Reloc Hot Spare
      c0t1d0s0          0     No      Last Erred   Yes


d32: Submirror of d30
    State: Needs maintenance
    Invoke: metasync d30
    Size: 518700000 blocks (247 GB)
    Stripe 0:
      Device     Start Block  Dbase        State Reloc Hot Spare
      c1t1d0s0          0     No            Okay   Yes

Repair attempt with format
(root) # format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <SUN300G cyl 46873 alt 2 hd 20 sec 625>  solaris
          /pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0
       1. c0t1d0 <SUN300G cyl 46873 alt 2 hd 20 sec 625>  solaris
          /pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@1,0
       2. c1t0d0 <SUN300G cyl 46873 alt 2 hd 20 sec 625>  solaris
          /pci@10,600000/pci@0/pci@8/pci@0/scsi@1/sd@0,0
       3. c1t1d0 <SUN300G cyl 46873 alt 2 hd 20 sec 625>  solaris
          /pci@10,600000/pci@0/pci@8/pci@0/scsi@1/sd@1,0
       4. c9t60080E50002FD908000004B2521931BDd0 <SUN-LCSM100_F-0784-2.18TB>
          /scsi_vhci/ssd@g60080e50002fd908000004b2521931bd
       5. c9t60080E50002FD908000004B6521931EFd0 <SUN-LCSM100_F-0784-2.18TB>
          /scsi_vhci/ssd@g60080e50002fd908000004b6521931ef
       6. c9t60080E50002FD908000004BA52193228d0 <SUN-LCSM100_F-0784-2.18TB>
          /scsi_vhci/ssd@g60080e50002fd908000004ba52193228
       7. c9t60080E50002FD908000004BE5219325Dd0 <SUN-LCSM100_F-0784-2.18TB>
          /scsi_vhci/ssd@g60080e50002fd908000004be5219325d
Specify disk (enter its number): 1
selecting c0t1d0: solaris

/dev/dsk/c0t1d0s0 is part of SVM volume stripe:d31. Please see metaclear(1M).
/dev/dsk/c0t1d0s1 is part of SVM volume stripe:d36. Please see metaclear(1M).
/dev/dsk/c0t1d0s7 contains an SVM mdb. Please see metadb(1M).


FORMAT MENU:
      disk       - select a disk
      type       - select (define) a disk type
      partition  - select (define) a partition table
      current    - describe the current disk
      format     - format and analyze the disk
      repair     - repair a defective sector
      label      - write label to the disk
      analyze    - surface analysis
      defect     - defect list management
      backup     - search for backup labels
      verify     - read and display labels
      save       - save new disk/partition definitions
      inquiry    - show vendor, product and revision
      volname    - set 8-character volume name
      !<cmd>     - execute <cmd>, then return
      quit
format> analyze


ANALYZE MENU:
      read     - read only test   (doesn't harm SunOS)
      refresh  - read then write  (doesn't harm data)
      test     - pattern testing  (doesn't harm data)
      write    - write then read      (corrupts data)
      compare  - write, read, compare (corrupts data)
      purge    - write, read, write   (corrupts data)
      verify   - write entire disk, then verify (corrupts data)
      print    - display data buffer
      setup    - set analysis parameters
      config   - show analysis parameters
      !<cmd>   - execute <cmd> , then return
      quit
analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptable with CTRL-C. Continue? y

      pass 0
   46872/19/599

      pass 1
   46872/19/599

Total of 0 defective blocks repaired.
analyze> quit


FORMAT MENU:
      disk       - select a disk
      type       - select (define) a disk type
      partition  - select (define) a partition table
      current    - describe the current disk
      format     - format and analyze the disk
      repair     - repair a defective sector
      label      - write label to the disk
      analyze    - surface analysis
      defect     - defect list management
      backup     - search for backup labels
      verify     - read and display labels
      save       - save new disk/partition definitions
      inquiry    - show vendor, product and revision
      volname    - set 8-character volume name
      !<cmd>     - execute <cmd>, then return
      quit
format> quit

(root) # metasync d30      (volume d32 started syncing d31's data, but the sync stopped at 3%)
(root) # metareplace -e d30 c0t1d0s0                  (this cannot repair it)
metareplace: zunyisms: d30: c0t1d0s0: component in invalid state to replace - Replace "Maintenance" components first
(root) # metastat -p
d35 -m d36 d37 1
d36 1 1 c0t1d0s1
d37 1 1 c1t1d0s1
d30 -m d31 d32 1
d31 1 1 c0t1d0s0
d32 1 1 c1t1d0s0
d25 -m d26 d27 1
d26 1 1 c0t0d0s6
d27 1 1 c1t0d0s6
d20 -m d21 d22 1
d21 1 1 c0t0d0s4
d22 1 1 c1t0d0s4
d15 -m d16 d17 1
d16 1 1 c0t0d0s3
d17 1 1 c1t0d0s3
d10 -m d11 d12 1
d11 1 1 c0t0d0s1
d12 1 1 c1t0d0s1
d5 -m d6 d7 1
d6 1 1 c0t0d0s0
d7 1 1 c1t0d0s0
d40 -m d41 d42 1
d41 1 2 /dev/dsk/c9t60080E50002FD908000004B2521931BDd0s0 /dev/dsk/c9t60080E50002FD908000004B6521931EFd0s0 -i 256b
d42 1 2 /dev/dsk/c9t60080E50002FD908000004BA52193228d0s0 /dev/dsk/c9t60080E50002FD908000004BE5219325Dd0s0 -i 256b
(root) # iostat -E
sd0       Soft Errors: 0 Hard Errors: 44 Transport Errors: 12
Vendor: HITACHI  Product: H109030SESUN300G Revision: A31A Serial No: 1307C50KPF
Size: 300.00GB <300000000000 bytes>
Media Error: 38 Device Not Ready: 0 No Device: 6 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd1       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: TEAC   Product: DV-W28SS-V       Revision: 1.0B Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 1 Predictive Failure Analysis: 0
sd2       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: HITACHI  Product: H109030SESUN300G Revision: A31A Serial No: 1307C5HJYF
Size: 300.00GB <300000000000 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd3       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: HITACHI  Product: H109030SESUN300G Revision: A31A Serial No: 1307C4E6MF
Size: 300.00GB <300000000000 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd6       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: HITACHI  Product: H109030SESUN300G Revision: A31A Serial No: 1307C53HKF
Size: 300.00GB <300000000000 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
ssd0      Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SUN      Product: LCSM100_F      Revision: 0784 Serial No:
Size: 2398.03GB <2398034860544 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
ssd1      Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SUN      Product: LCSM100_F      Revision: 0784 Serial No:
Size: 2398.03GB <2398034860544 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
ssd2      Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SUN      Product: LCSM100_F      Revision: 0784 Serial No:
Size: 2398.03GB <2398034860544 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 1 Predictive Failure Analysis: 0
ssd3      Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SUN      Product: LCSM100_F      Revision: 0784 Serial No:
Size: 2398.03GB <2398034860544 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 1 Predictive Failure Analysis: 0
st0       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: HP       Product: Ultrium 5-SCSI   Revision: Z55S Serial No:
The system log contains errors like these:

May 20 10:32:56 zunyisms scsi: WARNING: /pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@1,0 (sd0):
May 20 10:32:56 zunyisms      Error for Command: read(10)                Error Level: Retryable
May 20 10:32:56 zunyisms scsi: Requested Block: 88989716                  Error Block: 88989812
May 20 10:32:56 zunyisms scsi: Vendor: HITACHI                            Serial Number: 1307C50KPF
May 20 10:32:56 zunyisms scsi: Sense Key: Media_Error
May 20 10:32:56 zunyisms scsi: ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0
May 20 10:32:59 zunyisms scsi: WARNING: /pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@1,0 (sd0):
May 20 10:32:59 zunyisms      Error for Command: write(10)               Error Level: Retryable
May 20 10:32:59 zunyisms scsi: Requested Block: 93862340                  Error Block: 93862340
May 20 10:32:59 zunyisms scsi: Vendor: HITACHI                            Serial Number: 1307C50KPF
May 20 10:32:59 zunyisms scsi: Sense Key: Unit_Attention
May 20 10:32:59 zunyisms scsi: ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x17
May 20 10:33:00 zunyisms scsi: WARNING: /pci@0,600000/pci@0/pci@8/pci@0/scsi@1/sd@1,0 (sd0):
May 20 10:33:00 zunyisms      Error for Command: read(10)                Error Level: Retryable
May 20 10:33:00 zunyisms scsi: Requested Block: 88989716                  Error Block: 88989812
May 20 10:33:00 zunyisms scsi: Vendor: HITACHI                            Serial Number: 1307C50KPF
May 20 10:33:00 zunyisms scsi: Sense Key: Media_Error
May 20 10:33:00 zunyisms scsi: ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0
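
For anyone who wants to pick the failing disk out of a long `iostat -E` listing without reading it by eye, here is a rough Python sketch. It assumes the block layout shown in the output above (a header line with the Soft/Hard/Transport counters, followed by Vendor and Media Error lines). The names `parse_iostat_e` and `suspect_devices` are made up for this illustration and are not part of any Solaris tool.

```python
import re

# Header line of each `iostat -E` device block, e.g.
# "sd0       Soft Errors: 0 Hard Errors: 44 Transport Errors: 12"
HEADER = re.compile(
    r"^\s*(?P<dev>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+Transport Errors:\s*(?P<transport>\d+)"
)
SERIAL = re.compile(r"Serial No:[ \t]*(?P<serial>\S*)")
MEDIA = re.compile(r"Media Error:\s*(?P<media>\d+)")


def parse_iostat_e(text):
    """Return {device: {"soft", "hard", "transport", "media", "serial"}}."""
    devices = {}
    current = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            # A new device block starts here.
            current = {
                "soft": int(m.group("soft")),
                "hard": int(m.group("hard")),
                "transport": int(m.group("transport")),
                "media": 0,
                "serial": "",
            }
            devices[m.group("dev")] = current
            continue
        if current is None:
            continue
        m = SERIAL.search(line)
        if m:
            current["serial"] = m.group("serial")
        m = MEDIA.search(line)
        if m:
            current["media"] = int(m.group("media"))
    return devices


def suspect_devices(devices):
    """Names of devices with any hard, transport, or media errors."""
    return sorted(
        name for name, d in devices.items()
        if d["hard"] or d["transport"] or d["media"]
    )


# Two blocks copied from the iostat -E output in this post.
SAMPLE = """\
sd0       Soft Errors: 0 Hard Errors: 44 Transport Errors: 12
Vendor: HITACHI  Product: H109030SESUN300G Revision: A31A Serial No: 1307C50KPF
Size: 300.00GB <300000000000 bytes>
Media Error: 38 Device Not Ready: 0 No Device: 6 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd2       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: HITACHI  Product: H109030SESUN300G Revision: A31A Serial No: 1307C5HJYF
Size: 300.00GB <300000000000 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
"""

if __name__ == "__main__":
    parsed = parse_iostat_e(SAMPLE)
    for name in suspect_devices(parsed):
        d = parsed[name]
        print(name, d["serial"], "hard=%d media=%d" % (d["hard"], d["media"]))
```

On the data in this post only sd0 is flagged. The syslog lines tie sd0 to the path ending in scsi@1/sd@1,0, which format lists as c0t1d0, and the serial number 1307C50KPF matches in both places.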

kwtip posted on 2014-05-20 13:33

Please don't just read without replying. Do I really have to break the mirror apart and rebuild it?

byuq posted on 2014-05-21 13:11

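kwtip posted on 2014-05-20 13:33
Please don't just read without replying. Do I really have to break the mirror apart and rebuild it?

You actually got that right: it has to be broken apart and rebuilt.
My suggestion is to first replace the disk c0t1d0 (Serial Number: 1307C50KPF), then rebuild the RAID 1 mirror.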

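anthonypaopao posted on 2014-05-23 21:18

This post was last edited by anthonypaopao on 2014-05-23 21:22

I think d32 should be replaced first. The component showing Last Erred is probably the one that failed later. Besides, doesn't the Invoke line say to replace the "Maintenance" components first and then run metareplace?
I haven't run into this situation myself. In my opinion, ending up in this state just means maintenance was not done in time.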
   Needs maintenance   A problem has been detected. This
                       requires that the system administrator
                       replace the failed physical device.
                       Volumes displaying Needs maintenance
                       have incurred no data loss, although
                       additional failures could risk data
                       loss. Take action as quickly as
                       possible.

   Last erred          A problem has been detected. Data
                       loss is a possibility. This might
                       occur if a component of a submirror
                       fails and is not replaced by a hot
                       spare, therefore going into Needs
                       maintenance state. If the
                       corresponding component also fails,
                       it would go into Last erred state
                       and, as there is no remaining valid
                       data source, data loss could be a
                       possibility.
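
To make the "Maintenance first, then Last Erred" ordering concrete, here is a small Python sketch that reads the component rows of a metastat stripe table and lists the devices needing attention in that order. It assumes the column layout seen in the first post (Device, Start Block, Dbase, State, Reloc). The ordering comes from the metareplace message quoted there, and `parse_components` and `repair_order` are hypothetical helper names, not SVM commands.

```python
import re

# One component row from a metastat stripe table, e.g.
# "c0t1d0s0          0   No      Last Erred   Yes"
# Columns: Device, Start Block, Dbase, State, Reloc (Hot Spare may be empty).
ROW = re.compile(
    r"^\s*(?P<dev>\S+)\s+(?P<start>\d+)\s+(?P<dbase>Yes|No)\s+"
    r"(?P<state>.+?)\s+(?P<reloc>Yes|No)\b"
)

# Order taken from the metareplace message in the thread:
# "Maintenance" components are handled before "Last Erred" ones.
PRIORITY = {"Maintenance": 0, "Last Erred": 1}


def parse_components(text):
    """Return a list of (device, state) tuples found in metastat output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((m.group("dev"), m.group("state").strip()))
    return rows


def repair_order(rows):
    """Devices needing attention, 'Maintenance' first, then 'Last Erred'."""
    bad = [(PRIORITY[state], dev) for dev, state in rows if state in PRIORITY]
    return [dev for _, dev in sorted(bad)]


# Rows copied from the d31 and d32 sections of the first post.
SAMPLE = """\
d31: Submirror of d30
    State: Needs maintenance
    Size: 518700000 blocks (247 GB)
    Stripe 0:
      Device   Start Block  Dbase      State Reloc Hot Spare
      c0t1d0s0          0   No      Last Erred   Yes
d32: Submirror of d30
    State: Needs maintenance
    Stripe 0:
      Device   Start Block  Dbase      State Reloc Hot Spare
      c1t1d0s0          0   No            Okay   Yes
"""

if __name__ == "__main__":
    components = parse_components(SAMPLE)
    print(components)
    print(repair_order(components))
```

Note that in the output posted here no component is actually in the Maintenance state: c0t1d0s0 is Last Erred and c1t1d0s0 shows Okay, so the sketch only returns c0t1d0s0.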