Chinaunix

Title: request E3500 fiber harddisk

Author: dingdingA8    Time: 2008-06-04 10:32
I have got the hard disk, but we found that it has hardware errors.
In my experience, this disk needs to be replaced.
Please find the error information below.


The output of 'iostat -En':

c0t3d0          Soft Errors: 71 Hard Errors: 6076 Transport Errors: 2033
Vendor: SEAGATE  Product: ST336605FSUN36G  Revision: 0438 Serial No: 0132P0FNNG
Size: 36.42GB <36418595328 bytes>
Media Error: 3575 Device Not Ready: 18 No Device: 2008 Recoverable: 71
Illegal Request: 0 Predictive Failure Analysis: 11


Note: After a reboot, the error counters reset to 0, but some time later they all climb sharply again.
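
For anyone who wants to check several disks at once, the counters above can be pulled out of the 'iostat -En' text with a short script. This is only a minimal sketch: the function names (parse_iostat_en, suspect_disks) and the thresholds of 0 are assumptions made for illustration, not part of Solaris or of the original post, and because the counters reset at boot they only describe the time since the last reboot.

```python
import re

# A device block starts with "<device>   Soft Errors: N ...".
DEVICE_RE = re.compile(r"^(\S+)\s+Soft Errors:")

# Counter labels exactly as they appear in the `iostat -En` output above.
COUNTER_RE = re.compile(
    r"(Soft Errors|Hard Errors|Transport Errors|Media Error|"
    r"Device Not Ready|No Device|Recoverable|Illegal Request|"
    r"Predictive Failure Analysis):\s*(\d+)"
)


def parse_iostat_en(text):
    """Return {device: {counter_name: value}} parsed from `iostat -En` text."""
    disks = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = DEVICE_RE.match(line)
        if match:
            current = disks.setdefault(match.group(1), {})
        if current is not None:
            for name, value in COUNTER_RE.findall(line):
                current[name] = int(value)
    return disks


def suspect_disks(disks, hard_limit=0, media_limit=0, pfa_limit=0):
    """List devices whose counters exceed the (hypothetical) limits."""
    flagged = []
    for device, counters in disks.items():
        if (counters.get("Hard Errors", 0) > hard_limit
                or counters.get("Media Error", 0) > media_limit
                or counters.get("Predictive Failure Analysis", 0) > pfa_limit):
            flagged.append(device)
    return sorted(flagged)


# The output quoted in this post, used as sample input.
SAMPLE = """\
c0t3d0          Soft Errors: 71 Hard Errors: 6076 Transport Errors: 2033
Vendor: SEAGATE  Product: ST336605FSUN36G  Revision: 0438 Serial No: 0132P0FNNG
Size: 36.42GB <36418595328 bytes>
Media Error: 3575 Device Not Ready: 18 No Device: 2008 Recoverable: 71
Illegal Request: 0 Predictive Failure Analysis: 11
"""
```

Calling suspect_disks(parse_iostat_en(SAMPLE)) on the sample flags c0t3d0, since its hard error, media error, and predictive failure counters are all above zero.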


The output of 'metastat':
# metastat d30
d30: Mirror
    Submirror 0: d31
      State: Needs maintenance
    Submirror 1: d32
      State: Okay         
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 33590403 blocks (16 GB)


d31: Submirror of d30
    State: Needs maintenance
    Invoke: metareplace d30 d33 <new device>
    Size: 33590403 blocks (16 GB)
    Stripe 0:
        Device   Start Block  Dbase        State Reloc Hot Spare
        d33             0     No     Maintenance   No  


d33: Soft Partition
    Device: c0t3d0s0
    State: Error
    Size: 33591296 blocks (16 GB)
        Device     Start Block  Dbase Reloc
        c0t3d0s0       2889     No    Yes


        Extent              Start Block              Block count
             0                     2890                 33591296


Note: After running the 'metareplace' command, the state is still 'Needs maintenance'.


The error information in the 'messages' file:
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.warning] WARNING: /sbus@3,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000004cf204614,0 (ssd:
Jun  3 13:09:04 somcsys2        Error for Command: write(10)               Error Level: Fatal
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Requested Block: 5857354                   Error Block: 5857370
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: 0132P0FNNG  
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Sense Key: Hardware Error
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  ASC: 0x3 (peripheral device write fault), ASCQ: 0x0, FRU: 0x10
Jun  3 13:09:04 somcsys2 md_stripe: [ID 641072 kern.warning] WARNING: md: d31: write error on /dev/md/dsk/d33
Jun  3 13:09:04 somcsys2 md_mirror: [ID 104909 kern.warning] WARNING: md: d31: /dev/md/dsk/d33 needs maintenance
Jun  3 13:09:04 somcsys2 md_sp: [ID 641072 kern.warning] WARNING: md: d31: write error on /dev/dsk/c0t3d0s0
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.warning] WARNING: /sbus@3,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100000c504edfb8,0 (ssd4):
Jun  3 13:09:04 somcsys2        Error for Command: write(10)               Error Level: Retryable
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Requested Block: 71118592                  Error Block: 71118592
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: 0333A2H01E  
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x3
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.warning] WARNING: /sbus@3,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000004cf204614,0 (ssd:
Jun  3 13:09:04 somcsys2        Error for Command: mode sense              Error Level: Informational
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Requested Block: 0                         Error Block: 0
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: 0132P0FNNG  
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  Sense Key: Soft Error
Jun  3 13:09:04 somcsys2 scsi: [ID 107833 kern.notice]  ASC: 0x5d (drive operation marginal, service immediately (failure prediction threshold exceeded)), ASCQ: 0x0, FRU: 0x32

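Author: dingdingA8    Time: 2008-06-04 10:45
Could everyone take a look: do I need to swap in a new hard disk?
Author: metor78    Time: 2008-06-04 10:55
To be on the safe side, replace it. It is set up as RAID anyway.
Author: haishui    Time: 2008-06-04 21:23
Replace it right away! It already has media errors! iostat works as a counter, so after a reboot the values go back to 0.
Author: 风之幻想    Time: 2008-06-05 09:24
I recommend replacing the hard disk.
Author: wuxinghui403    Time: 2008-06-06 13:49
Title: Reply to #1, dingdingA8's post
If the disk has been in service for a long time, say several years, just replace it. If it has not, recovering it with metareplace -e should be enough. That disk has quite a lot of hardware errors, though, so you are better off putting in a new one.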

[ Last edited by wuxinghui403 on 2008-6-6 13:52 ]



