oushitianxia posted on 2012-03-19 10:29

Solaris 10 two-node cluster database failure

This post was last edited by oushitianxia on 2012-03-21 09:02

hnlt-vpdn1#cat /var/adm/messages
Mar 16 11:25:48 hnlt-vpdn1 ftpd: FTP LOGIN REFUSED (username in /etc/ftpd/ftpusers) FROM 119.39.227.98 , root
Mar 18 05:12:40 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop OFFLINE
Mar 18 05:12:57 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop ONLINE
Mar 18 05:12:57 hnlt-vpdn1 fctl: WARNING: fctl(2): AL_PA=0xef doesn't exist in LILP map
Mar 18 05:13:17 hnlt-vpdn1 scsi: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0 (fcp2):
Mar 18 05:13:17 hnlt-vpdn1      offlining lun=5 (trace=0), target=ef (trace=2800004)
Mar 18 05:13:17 hnlt-vpdn1 scsi: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0 (fcp2):
Mar 18 05:13:17 hnlt-vpdn1      offlining lun=4 (trace=0), target=ef (trace=2800004)
Mar 18 05:13:17 hnlt-vpdn1 scsi: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0 (fcp2):
Mar 18 05:13:17 hnlt-vpdn1      offlining lun=3 (trace=0), target=ef (trace=2800004)
Mar 18 05:13:17 hnlt-vpdn1 scsi: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0 (fcp2):
Mar 18 05:13:17 hnlt-vpdn1      offlining lun=2 (trace=0), target=ef (trace=2800004)
Mar 18 05:13:17 hnlt-vpdn1 scsi: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0 (fcp2):
Mar 18 05:13:17 hnlt-vpdn1      offlining lun=1 (trace=0), target=ef (trace=2800004)
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 genunix: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,5 (ssd4) offline
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      transport rejected fatal error
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 ufs: WARNING: Error writing master during ufs log roll
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 ufs: WARNING: ufs log for /u01 changed state to Error
Mar 18 05:13:17 hnlt-vpdn1 ufs: WARNING: Please umount(1M) /u01 and run fsck(1M)
Mar 18 05:13:17 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,0 (ssd3):
Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 scsi: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0 (fcp2):
Mar 18 05:13:17 hnlt-vpdn1      offlining lun=0 (trace=0), target=ef (trace=2800004)
Mar 18 05:13:21 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop OFFLINE
Mar 18 05:13:22 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop ONLINE
Mar 18 05:13:35 hnlt-vpdn1 scsi: ssd4 at fp2: name w202400a0b84938a2,5, bus address ef
Mar 18 05:13:35 hnlt-vpdn1 genunix: ssd4 is /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,5
Mar 18 05:13:35 hnlt-vpdn1 genunix: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,5 (ssd4) online
Mar 18 05:13:57 hnlt-vpdn1 AgentFramework: VCS ERROR V-16-1-13067 Thread(3) Agent is calling clean for resource(ORACLE_oracle) because the resource became OFFLINE unexpectedly, on its own.
Mar 18 05:13:57 hnlt-vpdn1 Had: VCS ERROR V-16-1-13067 (hnlt-vpdn1) Agent is calling clean for resource(ORACLE_oracle) because the resource became OFFLINE unexpectedly, on its own.
Mar 18 05:13:57 hnlt-vpdn1 Had: VCS ERROR V-16-1-1 (hnlt-vpdn1) Oracle:ORACLE_oracle:clean:Oracle home directory /u01/app/oracle/product/10.2.0 does not exist
Mar 18 05:13:57 hnlt-vpdn1 Had: VCS ERROR V-16-1-2 (hnlt-vpdn1) Oracle:ORACLE_oracle:clean:sqlplus/svrmgrl not found in /u01/app/oracle/product/10.2.0/bin
Mar 18 05:14:01 hnlt-vpdn1 AgentFramework: VCS ERROR V-16-1-13067 Thread(3) Agent is calling clean for resource(ORACLE_listener) because the resource became OFFLINE unexpectedly, on its own.
Mar 18 05:14:01 hnlt-vpdn1 Had: VCS ERROR V-16-1-13067 (hnlt-vpdn1) Agent is calling clean for resource(ORACLE_listener) because the resource became OFFLINE unexpectedly, on its own.
Mar 18 05:14:01 hnlt-vpdn1 Had: VCS CRITICAL V-16-1-41 (hnlt-vpdn1) Netlsnr:ORACLE_listener:clean:Listener process LISTENER not running
Mar 18 05:14:02 hnlt-vpdn1 AgentFramework: VCS ERROR V-16-1-13068 Thread(3) Resource(ORACLE_listener) - clean completed successfully.
Mar 18 05:14:28 hnlt-vpdn1 AgentFramework: VCS ERROR V-16-1-13068 Thread(3) Resource(ORACLE_oracle) - clean completed successfully.
Mar 18 05:14:29 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,1 (ssd2):
Mar 18 05:14:29 hnlt-vpdn1      Error for Command: write(10)               Error Level: Retryable
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Requested Block: 1520                      Error Block: 1520
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Vendor: IBM                              Serial Number:            
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Sense Key: Unit Attention
Mar 18 05:14:29 hnlt-vpdn1 scsi:       ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Mar 18 05:14:29 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,2 (ssd7):
Mar 18 05:14:29 hnlt-vpdn1      Error for Command: write(10)               Error Level: Retryable
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Requested Block: 80597                     Error Block: 80597
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Vendor: IBM                              Serial Number:            
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Sense Key: Unit Attention
Mar 18 05:14:29 hnlt-vpdn1 scsi:       ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Mar 18 05:14:29 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,3 (ssd6):
Mar 18 05:14:29 hnlt-vpdn1      Error for Command: write(10)               Error Level: Retryable
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Requested Block: 121545                  Error Block: 121545
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Vendor: IBM                              Serial Number:            
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Sense Key: Unit Attention
Mar 18 05:14:29 hnlt-vpdn1 scsi:       ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Mar 18 05:14:29 hnlt-vpdn1 scsi: WARNING: /pci@0/pci@0/pci@8/pci@0/pci@9/SUNW,qlc@0/fp@0,0/ssd@w202400a0b84938a2,4 (ssd5):
Mar 18 05:14:29 hnlt-vpdn1      Error for Command: write(10)               Error Level: Retryable
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Requested Block: 1520                      Error Block: 1520
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Vendor: IBM                              Serial Number:            
Mar 18 05:14:29 hnlt-vpdn1 scsi:       Sense Key: Unit Attention
Mar 18 05:14:29 hnlt-vpdn1 scsi:       ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Mar 18 05:14:32 hnlt-vpdn1 Had: VCS ERROR V-16-1-10205 Group oracle is faulted on system hnlt-vpdn1
Mar 18 05:16:37 hnlt-vpdn1 Had: VCS ERROR V-16-1-13066 (hnlt-vpdn2) Agent is calling clean for resource(ORACLE_u02) because the resource is not up even after online completed.
Mar 18 05:16:37 hnlt-vpdn1 Had: VCS ERROR V-16-1-13066 (hnlt-vpdn2) Agent is calling clean for resource(ORACLE_u04) because the resource is not up even after online completed.
Mar 18 05:16:37 hnlt-vpdn1 Had: VCS ERROR V-16-1-13066 (hnlt-vpdn2) Agent is calling clean for resource(ORACLE_ora_backup) because the resource is not up even after online completed.
Mar 18 05:16:37 hnlt-vpdn1 Had: VCS ERROR V-16-1-13066 (hnlt-vpdn2) Agent is calling clean for resource(ORACLE_u01) because the resource is not up even after online completed.
Mar 18 05:16:37 hnlt-vpdn1 Had: VCS ERROR V-16-1-13066 (hnlt-vpdn2) Agent is calling clean for resource(ORACLE_u03) because the resource is not up even after online completed.
Mar 18 05:16:38 hnlt-vpdn1 Had: VCS ERROR V-16-1-10303 Resource ORACLE_u04 (Owner: unknown, Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2
Mar 18 05:16:38 hnlt-vpdn1 Had: VCS ERROR V-16-1-10303 Resource ORACLE_u02 (Owner: unknown, Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2
Mar 18 05:16:38 hnlt-vpdn1 Had: VCS ERROR V-16-1-10303 Resource ORACLE_ora_backup (Owner: unknown, Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2
Mar 18 05:16:38 hnlt-vpdn1 Had: VCS ERROR V-16-1-10303 Resource ORACLE_u01 (Owner: unknown, Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2
Mar 18 05:16:38 hnlt-vpdn1 Had: VCS ERROR V-16-1-10303 Resource ORACLE_u03 (Owner: unknown, Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2
Mar 18 05:16:41 hnlt-vpdn1 Had: VCS ERROR V-16-1-10205 Group oracle is faulted on system hnlt-vpdn2

tzpi2000 posted on 2012-03-19 10:43

Mar 18 05:12:40 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop OFFLINE
Mar 18 05:12:57 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop ONLINE
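The two lines quoted above are one offline/online flap reported by qlc, the QLogic Fibre Channel HBA driver. As a rough sketch only, the Python below pairs each Loop OFFLINE with the next Loop ONLINE for the same qlc instance and reports how long the loop was down. It assumes the syslog layout shown in the first post; the function name `loop_flaps` is made up for illustration, and the year 2012 is taken from the post date because the log itself carries no year.

```python
import re
from datetime import datetime

# Matches lines such as:
# Mar 18 05:12:40 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop OFFLINE
LOOP_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ qlc: NOTICE: "
    r"Qlogic qlc\((\d+)\): Loop (OFFLINE|ONLINE)"
)

def loop_flaps(lines, year=2012):
    """Return (qlc_instance, offline_time, seconds_down) per OFFLINE/ONLINE pair.

    The year is an assumption: classic syslog timestamps do not include it.
    """
    pending = {}  # qlc instance -> time the loop went offline
    flaps = []
    for line in lines:
        m = LOOP_RE.match(line)
        if not m:
            continue
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        inst, state = int(m.group(2)), m.group(3)
        if state == "OFFLINE":
            pending[inst] = stamp
        elif inst in pending:
            start = pending.pop(inst)
            flaps.append((inst, start, int((stamp - start).total_seconds())))
    return flaps

# The four qlc lines from the log in the first post.
SAMPLE = [
    "Mar 18 05:12:40 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop OFFLINE",
    "Mar 18 05:12:57 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop ONLINE",
    "Mar 18 05:13:21 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop OFFLINE",
    "Mar 18 05:13:22 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop ONLINE",
]

for inst, start, secs in loop_flaps(SAMPLE):
    print(f"qlc({inst}) loop down at {start:%H:%M:%S} for {secs}s")
```

On the posted log this shows two flaps on qlc(0), with the LUN offlining and the "Device is gone" warnings falling between them.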

hejia0105 posted on 2012-03-19 10:52

Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone
Mar 18 05:13:17 hnlt-vpdn1 ufs: WARNING: ufs log for /u01 changed state to Error
Mar 18 05:13:17 hnlt-vpdn1 ufs: WARNING: Please umount(1M) /u01 and run fsck(1M)
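The /u01 file system on the current node is damaged; I suggest you unmount it and run fsck, as the log message says.

In addition, VCS detected that the oracle resource had gone offline and failed over automatically.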
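To make that chain of events easier to see in a long messages file, here is a small sketch that collapses the log into a timeline of event kinds. It is only an illustration: the labels and the names `classify` and `timeline` are invented here, the patterns are taken from the lines in the first post, and the time is cut out by fixed column position, which assumes the classic `Mon DD HH:MM:SS` syslog prefix.

```python
import re

# Ordered (label, pattern) pairs; the first matching rule wins.
# Labels are hypothetical names chosen for this sketch.
RULES = [
    ("fc-loop", re.compile(r"qlc: NOTICE: .*Loop (OFFLINE|ONLINE)")),
    ("lun-offline", re.compile(r"offlining lun=\d+")),
    ("device-gone", re.compile(r"Device is gone")),
    ("ufs-error", re.compile(r" ufs: WARNING:")),
    ("vcs", re.compile(r"VCS (ERROR|CRITICAL) V-")),
]

def classify(line):
    """Return the label of the first rule that matches, or None."""
    for label, pattern in RULES:
        if pattern.search(line):
            return label
    return None

def timeline(lines):
    """Collapse consecutive lines of the same kind into (time, label, count)."""
    out = []
    for line in lines:
        label = classify(line)
        if label is None:
            continue
        stamp = line[7:15]  # HH:MM:SS, assuming a "Mon DD HH:MM:SS" prefix
        if out and out[-1][1] == label:
            first_time, _, count = out[-1]
            out[-1] = (first_time, label, count + 1)
        else:
            out.append((stamp, label, 1))
    return out

# A few lines copied from the log in the first post.
SAMPLE = [
    "Mar 18 05:12:40 hnlt-vpdn1 qlc: NOTICE: Qlogic qlc(0): Loop OFFLINE",
    "Mar 18 05:13:17 hnlt-vpdn1      offlining lun=5 (trace=0), target=ef (trace=2800004)",
    "Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone",
    "Mar 18 05:13:17 hnlt-vpdn1      Command failed to complete...Device is gone",
    "Mar 18 05:13:17 hnlt-vpdn1 ufs: WARNING: ufs log for /u01 changed state to Error",
    "Mar 18 05:13:17 hnlt-vpdn1 ufs: WARNING: Please umount(1M) /u01 and run fsck(1M)",
    "Mar 18 05:13:57 hnlt-vpdn1 Had: VCS ERROR V-16-1-13067 (hnlt-vpdn1) Agent is calling clean for resource(ORACLE_oracle) because the resource became OFFLINE unexpectedly, on its own.",
]

for stamp, label, count in timeline(SAMPLE):
    print(stamp, label, count)
```

Read top to bottom, the output follows the order described above: the FC loop drops, the LUNs go offline, the UFS log for /u01 goes into an error state, and only then does VCS react to the oracle resource.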

milujite posted on 2012-03-19 12:33

The file system itself was in an error state, so of course VCS failed over.

oushitianxia posted on 2012-03-19 12:55

Reply to #2 tzpi2000

So this is a network card fault.

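oushitianxia posted on 2012-03-19 12:58

Reply to #4 milujite

This VCS setup went into production without ever passing its tests.
/u01 is on the disk array.
If VCS had been working properly, could failing over to the standby node have avoided the outage?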
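The log in the first post already records what happened on the standby node, so one way to approach this question is to pull out the VCS fault messages per system. A minimal sketch, assuming the `Had:` message format shown in that log; the function name `faulted_by_system` is hypothetical, and the regular expression only covers the V-16-1-10303 "is FAULTED" lines that actually appear there.

```python
import re
from collections import defaultdict

# Matches the resource-fault lines from the posted log, for example:
# Had: VCS ERROR V-16-1-10303 Resource ORACLE_u04 (Owner: unknown,
#      Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2
FAULT_RE = re.compile(
    r"V-16-1-10303 Resource (\S+) \(Owner: [^,]*, Group: (\S+)\) "
    r"is FAULTED(?: \(([^)]*)\))? on sys (\S+)"
)

def faulted_by_system(lines):
    """Map each system name to a list of (resource, reason) faults."""
    result = defaultdict(list)
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            resource, _group, reason, system = m.groups()
            result[system].append((resource, reason))
    return dict(result)

# Three of the fault lines copied from the log in the first post.
SAMPLE = [
    "Mar 18 05:16:38 hnlt-vpdn1 Had: VCS ERROR V-16-1-10303 Resource ORACLE_u04 (Owner: unknown, Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2",
    "Mar 18 05:16:38 hnlt-vpdn1 Had: VCS ERROR V-16-1-10303 Resource ORACLE_u02 (Owner: unknown, Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2",
    "Mar 18 05:16:38 hnlt-vpdn1 Had: VCS ERROR V-16-1-10303 Resource ORACLE_u01 (Owner: unknown, Group: oracle) is FAULTED (timed out) on sys hnlt-vpdn2",
]

for system, faults in faulted_by_system(SAMPLE).items():
    for resource, reason in faults:
        print(system, resource, reason)
```

In the posted log, ORACLE_u01 through ORACLE_u04 and ORACLE_ora_backup all faulted with "timed out" on hnlt-vpdn2 at 05:16:38, and the oracle group was then reported faulted on that system as well. That suggests the standby could not bring these resources online either, but the log alone does not show whether a properly tested VCS configuration would have behaved differently.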

solaris study posted on 2012-03-20 16:21

I can't see the post content.

oushitianxia posted on 2012-03-21 09:02

Reply to #7 solaris study

It's fine now.

send_linux posted on 2012-03-21 22:41

oushitianxia posted on 2012-03-21 09:02
Reply to #7 solaris study

Please do share how you solved this problem, haha.