
Chinaunix


Sun Fire E2900 rebooted on its own; could the experts please help analyze it?

#1 | Posted 2010-03-29 12:51
The contents of the messages file are below.
This is one node of a two-node Oracle RAC cluster; one day it suddenly rebooted on its own.
The CPUs and memory check out as normal, and the Oracle logs show no errors. I don't know what caused it.
I'm posting the messages logged after the reboot. If any experts pass by, please take a look and tell me what made the machine reboot!
Many thanks!!!

Mar 22 11:59:42 test2 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.9 Version Generic_118558-28 64-bit
Mar 22 11:59:42 test2 genunix: [ID 943905 kern.notice] Copyright 1983-2003 Sun Microsystems, Inc.  All rights reserved.
Mar 22 11:59:42 test2 Use is subject to license terms.
Mar 22 11:59:42 test2 genunix: [ID 678236 kern.info] Ethernet address = 0:3:ba:14:8b:da
Mar 22 11:59:42 test2 unix: [ID 389951 kern.info] mem = 16777216K (0x400000000)
Mar 22 11:59:42 test2 unix: [ID 930857 kern.info] avail mem = 16434577408
Mar 22 11:59:42 test2 rootnex: [ID 466748 kern.info] root nexus = Sun Fire E2900
Mar 22 11:59:42 test2 rootnex: [ID 349649 kern.info] ssm0 at root: SSM Node 0
Mar 22 11:59:42 test2 genunix: [ID 936769 kern.info] ssm0 is /ssm@0,0
Mar 22 11:59:42 test2 ssm: [ID 349649 kern.info] pci108e,80020 at ssm0: Node 0 Safari id 24 0xc700000
Mar 22 11:59:42 test2 genunix: [ID 936769 kern.info] pcisch0 is /ssm@0,0/pci@18,700000
Mar 22 11:59:42 test2 pcisch: [ID 370704 kern.info] PCI-device: pci@4, pci_pci1
Mar 22 11:59:42 test2 genunix: [ID 936769 kern.info] pci_pci1 is /ssm@0,0/pci@18,700000/pci@4
Mar 22 11:59:42 test2 ssm: [ID 349649 kern.info] pci108e,80021 at ssm0: Node 0 Safari id 24 0xc600000
Mar 22 11:59:42 test2 genunix: [ID 936769 kern.info] pcisch1 is /ssm@0,0/pci@18,600000
Mar 22 11:59:42 test2 ssm: [ID 349649 kern.info] pci108e,80022 at ssm0: Node 0 Safari id 25 0xcf00000
Mar 22 11:59:42 test2 genunix: [ID 936769 kern.info] pcisch2 is /ssm@0,0/pci@19,700000
Mar 22 11:59:42 test2 ssm: [ID 349649 kern.info] pci108e,80023 at ssm0: Node 0 Safari id 25 0xce00000
Mar 22 11:59:42 test2 genunix: [ID 936769 kern.info] pcisch3 is /ssm@0,0/pci@19,600000
Mar 22 11:59:42 test2 pci_pci: [ID 370704 kern.info] PCI-device: ide@2, uata0
Mar 22 11:59:42 test2 genunix: [ID 936769 kern.info] uata0 is /ssm@0,0/pci@18,700000/pci@4/ide@2
Mar 22 11:59:42 test2 scsi: [ID 365881 kern.info] /ssm@0,0/pci@18,700000/scsi@2 (mpt0):
Mar 22 11:59:42 test2  Rev. 7 LSI, Inc. 1030 found.
Mar 22 11:59:42 test2 scsi: [ID 365881 kern.info] /ssm@0,0/pci@18,700000/scsi@2 (mpt0):
Mar 22 11:59:42 test2  mpt0 supports power management.
Mar 22 11:59:42 test2 scsi: [ID 365881 kern.info] /ssm@0,0/pci@18,700000/scsi@2 (mpt0):
Mar 22 11:59:42 test2  mpt0 Firmware version v1.3.27.0
Mar 22 11:59:42 test2 scsi: [ID 365881 kern.info] /ssm@0,0/pci@18,700000/scsi@2 (mpt0):
Mar 22 11:59:42 test2  mpt0: IOC Operational.
Mar 22 11:59:44 test2 pcisch: [ID 370704 kern.info] PCI-device: scsi@2, mpt0
Mar 22 11:59:44 test2 genunix: [ID 936769 kern.info] mpt0 is /ssm@0,0/pci@18,700000/scsi@2
Mar 22 11:59:44 test2 scsi: [ID 365881 kern.info] /ssm@0,0/pci@18,700000/scsi@2,1 (mpt1):
Mar 22 11:59:44 test2  Rev. 7 LSI, Inc. 1030 found.
Mar 22 11:59:44 test2 scsi: [ID 365881 kern.info] /ssm@0,0/pci@18,700000/scsi@2,1 (mpt1):
Mar 22 11:59:44 test2  mpt1 supports power management.
Mar 22 11:59:44 test2 scsi: [ID 365881 kern.info] /ssm@0,0/pci@18,700000/scsi@2,1 (mpt1):
Mar 22 11:59:44 test2  mpt1 Firmware version v1.3.27.0
Mar 22 11:59:44 test2 scsi: [ID 365881 kern.info] /ssm@0,0/pci@18,700000/scsi@2,1 (mpt1):
Mar 22 11:59:44 test2  mpt1: IOC Operational.
Mar 22 11:59:46 test2 pcisch: [ID 370704 kern.info] PCI-device: scsi@2,1, mpt1
Mar 22 11:59:46 test2 genunix: [ID 936769 kern.info] mpt1 is /ssm@0,0/pci@18,700000/scsi@2,1
Mar 22 11:59:46 test2 scsi: [ID 193665 kern.info] sd30 at uata0: target 0 lun 0
Mar 22 11:59:46 test2 genunix: [ID 936769 kern.info] sd30 is /ssm@0,0/pci@18,700000/pci@4/ide@2/sd@0,0
Mar 22 11:59:46 test2 scsi: [ID 193665 kern.info] sd0 at mpt0: target 0 lun 0
Mar 22 11:59:46 test2 genunix: [ID 936769 kern.info] sd0 is /ssm@0,0/pci@18,700000/scsi@2/sd@0,0
Mar 22 11:59:58 test2 scsi: [ID 193665 kern.info] sd1 at mpt0: target 1 lun 0
Mar 22 11:59:58 test2 genunix: [ID 936769 kern.info] sd1 is /ssm@0,0/pci@18,700000/scsi@2/sd@1,0
Mar 22 12:00:03 test2 swapgeneric: [ID 308332 kern.info] root on /pseudo/md@0:0,30,blk fstype ufs
Mar 22 12:00:03 test2 mpxio: [ID 181378 kern.info] /scsi_vhci (scsi_vhci0) multipath capabilities enabled.
Mar 22 12:00:03 test2 rootnex: [ID 349649 kern.info] scsi_vhci0 at root
Mar 22 12:00:03 test2 genunix: [ID 936769 kern.info] scsi_vhci0 is /scsi_vhci
Mar 22 12:00:03 test2 qlc: [ID 308162 kern.info] Qlogic qlc FCA Driver v20060323-2.12 (0)
Mar 22 12:00:04 test2 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(0): Loop ONLINE
Mar 22 12:00:04 test2 qlc: [ID 694252 kern.info] NOTICE: qlc(0): Firmware version 3.3.18
Mar 22 12:00:04 test2 pcisch: [ID 370704 kern.info] PCI-device: SUNW,qlc@1, qlc0
Mar 22 12:00:04 test2 genunix: [ID 936769 kern.info] qlc0 is /ssm@0,0/pci@19,700000/SUNW,qlc@1
Mar 22 12:00:04 test2 qlc: [ID 308162 kern.info] Qlogic qlc FCA Driver v20060323-2.12 (1)
Mar 22 12:00:05 test2 qlc: [ID 630585 kern.info] NOTICE: Qlogic qlc(1): Loop ONLINE
Mar 22 12:00:05 test2 qlc: [ID 694252 kern.info] NOTICE: qlc(1): Firmware version 3.3.18
Mar 22 12:00:05 test2 pcisch: [ID 370704 kern.info] PCI-device: SUNW,qlc@2, qlc1
Mar 22 12:00:05 test2 genunix: [ID 936769 kern.info] qlc1 is /ssm@0,0/pci@19,700000/SUNW,qlc@2
Mar 22 12:00:05 test2 pseudo: [ID 129642 kern.info] pseudo-device: fcp0
Mar 22 12:00:05 test2 genunix: [ID 936769 kern.info] fcp0 is /pseudo/fcp@0
Mar 22 12:00:05 test2 genunix: [ID 936769 kern.info] fp0 is /ssm@0,0/pci@19,700000/SUNW,qlc@1/fp@0,0
Mar 22 12:00:05 test2 genunix: [ID 936769 kern.info] fp1 is /ssm@0,0/pci@19,700000/SUNW,qlc@2/fp@0,0
Mar 22 12:00:06 test2 scsi: [ID 799468 kern.info] ssd11 at scsi_vhci0: name g6006016084a21a00fa613d356da0db11, bus address g6006016084a21a00fa613d356da0db11
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] ssd11 is /scsi_vhci/ssd@g6006016084a21a00fa613d356da0db11
Mar 22 12:00:06 test2 mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g6006016084a21a00fa613d356da0db11 (ssd11) multipath status: optimal, path /ssm@0,0/pci@19,700000/SUNW,qlc@1/fp@0,0 (fp0) to target address: 5006016941e02afe,4 is standby. Load balancing: round-robin
Mar 22 12:00:06 test2 scsi: [ID 799468 kern.info] ssd9 at scsi_vhci0: name g6006016084a21a0096ef3d1f6da0db11, bus address g6006016084a21a0096ef3d1f6da0db11
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] ssd9 is /scsi_vhci/ssd@g6006016084a21a0096ef3d1f6da0db11
Mar 22 12:00:06 test2 mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g6006016084a21a0096ef3d1f6da0db11 (ssd9) multipath status: optimal, path /ssm@0,0/pci@19,700000/SUNW,qlc@1/fp@0,0 (fp0) to target address: 5006016941e02afe,3 is online. Load balancing: round-robin
Mar 22 12:00:06 test2 scsi: [ID 799468 kern.info] ssd10 at scsi_vhci0: name g6006016084a21a00740095ce6da0db11, bus address g6006016084a21a00740095ce6da0db11
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] ssd10 is /scsi_vhci/ssd@g6006016084a21a00740095ce6da0db11
Mar 22 12:00:06 test2 mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g6006016084a21a00740095ce6da0db11 (ssd10) multipath status: optimal, path /ssm@0,0/pci@19,700000/SUNW,qlc@1/fp@0,0 (fp0) to target address: 5006016941e02afe,2 is online. Load balancing: round-robin
Mar 22 12:00:06 test2 scsi: [ID 799468 kern.info] ssd8 at scsi_vhci0: name g60060160eeb01a00464344071d9fdb11, bus address g60060160eeb01a00464344071d9fdb11
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] ssd8 is /scsi_vhci/ssd@g60060160eeb01a00464344071d9fdb11
Mar 22 12:00:06 test2 mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g60060160eeb01a00464344071d9fdb11 (ssd8) multipath status: optimal, path /ssm@0,0/pci@19,700000/SUNW,qlc@1/fp@0,0 (fp0) to target address: 5006016941e02afe,1 is standby. Load balancing: round-robin
Mar 22 12:00:06 test2 scsi: [ID 799468 kern.info] ssd2 at scsi_vhci0: name g60060160eeb01a008c49a860cd9edb11, bus address g60060160eeb01a008c49a860cd9edb11
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] ssd2 is /scsi_vhci/ssd@g60060160eeb01a008c49a860cd9edb11
Mar 22 12:00:06 test2 mpxio: [ID 669396 kern.info] /scsi_vhci/ssd@g60060160eeb01a008c49a860cd9edb11 (ssd2) multipath status: optimal, path /ssm@0,0/pci@19,700000/SUNW,qlc@1/fp@0,0 (fp0) to target address: 5006016941e02afe,0 is standby. Load balancing: round-robin
Mar 22 12:00:06 test2 genunix: [ID 370176 kern.warning] WARNING: forceload of misc/md_trans failed
Mar 22 12:00:06 test2 genunix: [ID 370176 kern.warning] WARNING: forceload of misc/md_raid failed
Mar 22 12:00:06 test2 genunix: [ID 370176 kern.warning] WARNING: forceload of misc/md_hotspares failed
Mar 22 12:00:06 test2 genunix: [ID 370176 kern.warning] WARNING: forceload of misc/md_sp failed
Mar 22 12:00:06 test2 ssm: [ID 349649 kern.info] memory-controller0 at ssm0: Node 0 Safari id 0 0x400000 ...
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] mc-us30 is /ssm@0,0/memory-controller@0,400000
Mar 22 12:00:06 test2 ssm: [ID 349649 kern.info] memory-controller1 at ssm0: Node 0 Safari id 1 0xc00000 ...
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] mc-us31 is /ssm@0,0/memory-controller@1,400000
Mar 22 12:00:06 test2 ssm: [ID 349649 kern.info] memory-controller2 at ssm0: Node 0 Safari id 2 0x1400000 ...
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] mc-us32 is /ssm@0,0/memory-controller@2,400000
Mar 22 12:00:06 test2 ssm: [ID 349649 kern.info] memory-controller3 at ssm0: Node 0 Safari id 3 0x1c00000 ...
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] mc-us33 is /ssm@0,0/memory-controller@3,400000
Mar 22 12:00:06 test2 pcisch: [ID 370704 kern.info] PCI-device: pci@1, pci_pci0
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] pci_pci0 is /ssm@0,0/pci@18,600000/pci@1
Mar 22 12:00:06 test2 pci_pci: [ID 370704 kern.info] PCI-device: bootbus-controller@3, sgsbbc0
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] sgsbbc0 is /ssm@0,0/pci@18,700000/pci@4/bootbus-controller@3
Mar 22 12:00:06 test2 pseudo: [ID 129642 kern.info] pseudo-device: sgenv0
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] sgenv0 is /pseudo/sgenv@0
Mar 22 12:00:06 test2 unix: [ID 758372 kern.notice] Hardware watchdog enabled
Mar 22 12:00:06 test2 unix: [ID 270833 kern.info] cpu0: UltraSPARC-IV+ (portid 0 impl 0x19 ver 0x22 clock 1500 MHz)
Mar 22 12:00:06 test2 unix: [ID 270833 kern.info] cpu1: UltraSPARC-IV+ (portid 1 impl 0x19 ver 0x22 clock 1500 MHz)
Mar 22 12:00:06 test2 unix: [ID 721127 kern.info] cpu 1 initialization complete - online
Mar 22 12:00:06 test2 unix: [ID 270833 kern.info] cpu2: UltraSPARC-IV+ (portid 2 impl 0x19 ver 0x22 clock 1500 MHz)
Mar 22 12:00:06 test2 unix: [ID 721127 kern.info] cpu 2 initialization complete - online
Mar 22 12:00:06 test2 unix: [ID 270833 kern.info] cpu3: UltraSPARC-IV+ (portid 3 impl 0x19 ver 0x22 clock 1500 MHz)
Mar 22 12:00:06 test2 pseudo: [ID 129642 kern.info] pseudo-device: lw80
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] lw80 is /pseudo/lw8@0
Mar 22 12:00:06 test2 unix: [ID 721127 kern.info] cpu 3 initialization complete - online
Mar 22 12:00:06 test2 unix: [ID 270833 kern.info] cpu512: UltraSPARC-IV+ (portid 0 impl 0x19 ver 0x22 clock 1500 MHz)
Mar 22 12:00:06 test2 unix: [ID 721127 kern.info] cpu 512 initialization complete - online
Mar 22 12:00:06 test2 unix: [ID 270833 kern.info] cpu513: UltraSPARC-IV+ (portid 1 impl 0x19 ver 0x22 clock 1500 MHz)
Mar 22 12:00:06 test2 unix: [ID 721127 kern.info] cpu 513 initialization complete - online
Mar 22 12:00:06 test2 unix: [ID 270833 kern.info] cpu514: UltraSPARC-IV+ (portid 2 impl 0x19 ver 0x22 clock 1500 MHz)
Mar 22 12:00:06 test2 unix: [ID 721127 kern.info] cpu 514 initialization complete - online
Mar 22 12:00:06 test2 unix: [ID 270833 kern.info] cpu515: UltraSPARC-IV+ (portid 3 impl 0x19 ver 0x22 clock 1500 MHz)
Mar 22 12:00:06 test2 unix: [ID 721127 kern.info] cpu 515 initialization complete - online
Mar 22 12:00:06 test2 lw8: [ID 190882 kern.notice] Unretrieved lom log history follows ...^M
Mar 22 12:00:06 test2 ^M
Mar 22 12:00:06 test2 lw8: [ID 279603 kern.crit]   3/21/10 8:40:04 PM ErrorMonitor: Domain A has a SYSTEM ERROR^M
Mar 22 12:00:06 test2 pseudo: [ID 129642 kern.info] pseudo-device: ntwdt0
Mar 22 12:00:06 test2 genunix: [ID 936769 kern.info] ntwdt0 is /pseudo/ntwdt@0
Mar 22 12:00:11 test2 lw8: [ID 408585 kern.crit]   3/21/10 8:40:05 PM ErrorMonitor: Domain A has a SYSTEM ERROR^M
Mar 22 12:00:11 test2 lw8: [ID 732992 kern.error]   3/21/10 8:40:05 PM /N0/RP2 encountered the first error^M
Mar 22 12:00:11 test2 lw8: [ID 435802 kern.error]   3/21/10 8:40:05 PM /N0/RP0 encountered the first error^M
Mar 22 12:00:12 test2 lw8: [ID 123901 kern.error]   3/21/10 8:40:05 PM ^M
Mar 22 12:00:12 test2 /RP2/ar0: ^M
Mar 22 12:00:12 test2 >>> SafariPortError0[0x200] : 0x00008004^M
Mar 22 12:00:12 test2                FE [15:15] : 0x1 ^M
Mar 22 12:00:12 test2           QUnfErr [02:02] : 0x1 Queue underflow error^M
Mar 22 12:00:12 test2 ^M
Mar 22 12:00:12 test2 lw8: [ID 537820 kern.error]   3/21/10 8:40:05 PM ^M
Mar 22 12:00:12 test2 >>> SafariPortError1[0x210] : 0x00008001^M
Mar 22 12:00:12 test2           AdrPErr [00:00] : 0x1 Address parity error^M
Mar 22 12:00:12 test2                FE [15:15] : 0x1 ^M
Mar 22 12:00:12 test2 ^M
Mar 22 12:00:12 test2 lw8: [ID 207586 kern.error]   3/21/10 8:40:05 PM ^M
Mar 22 12:00:12 test2 lw8: [ID 727739 kern.error]   3/21/10 8:40:05 PM ArAsic reported first error on /N0/IB6^M
Mar 22 12:00:12 test2 lw8: [ID 865822 kern.error]   3/21/10 8:40:05 PM ^M
Mar 22 12:00:12 test2 /IB6/ar0: ^M
Mar 22 12:00:12 test2 >>> L2CheckError[0x6150] : 0x00009e1e^M
Mar 22 12:00:12 test2       CMDVSyncErr [12:09] : 0xf Ports [9:6] command valid mismatched against internal expected command valid^M
Mar 22 12:00:12 test2       PreqSyncErr [04:01] : 0xf Ports [9:6] prereq mismatched against internal expected prereq^M
Mar 22 12:00:12 test2                FE [15:15] : 0x1 ^M
Mar 22 12:00:12 test2 ^M
Mar 22 12:00:13 test2 lw8: [ID 407071 kern.error]   3/21/10 8:40:06 PM ^M
Mar 22 12:00:13 test2 /RP0/ar0: ^M
Mar 22 12:00:13 test2 >>> SafariPortError0[0x200] : 0x00008004^M
Mar 22 12:00:13 test2                FE [15:15] : 0x1 ^M
Mar 22 12:00:13 test2           QUnfErr [02:02] : 0x1 Queue underflow error^M
Mar 22 12:00:13 test2 ^M
Mar 22 12:00:13 test2 lw8: [ID 211682 kern.error]   3/21/10 8:40:06 PM ^M
Mar 22 12:00:13 test2 lw8: [ID 784451 kern.error]   3/21/10 8:40:08 PM [AD] Event: E2900.ASIC.AR.QUNF.10473020^M
Mar 22 12:00:13 test2      CSN:  DomainID: A ADInfo: 1.SCAPP.20.14^M
Mar 22 12:00:13 test2      Time: Sun Mar 21 20:50:25 PDT 2010^M
Mar 22 12:00:13 test2      FRU-List-Count: 2; FRU-PN: 5411384; FRU-SN: 018370; FRU-LOC: /N0/RP0^M
Mar 22 12:00:13 test2                         FRU-PN: 5405489; FRU-SN: 161003; FRU-LOC: /N0/RP2^M
Mar 22 12:00:13 test2      Recommended-Action: Service action required^M
Mar 22 12:00:13 test2 ^M
Mar 22 12:00:13 test2 [AD] Event: E2900.ASIC.AR.ADR_PERR.10473001^M
Mar 22 12:00:13 test2      CSN:  DomainID: A ADInfo: 1.SCAPP.20.14^M
Mar 22 12:00:13 test2      Time: Sun Mar 21 20:50:25 PDT 2010^M
Mar 22 12:00:13 test2      FRU-List-Count: 1; FRU-PN: 5406679; FRU-SN: 005388; FRU-LOC: /N0/SB0^M
Mar 22 12:00:13 test2      Recommended-Action: Service action required^M
Mar 22 12:00:13 test2 ^M
Mar 22 12:00:13 test2 last message repeated 1 time
Mar 22 12:00:13 test2 lw8: [ID 792643 kern.error]   3/21/10 8:40:09 PM [AD] Event: E2900.ASIC.AR.QUNF.10473020^M
Mar 22 12:00:13 test2      CSN:  DomainID: A ADInfo: 1.SCAPP.20.14^M
Mar 22 12:00:13 test2      Time: Sun Mar 21 20:50:25 PDT 2010^M
Mar 22 12:00:13 test2      FRU-List-Count: 2; FRU-PN: 5411384; FRU-SN: 018370; FRU-LOC: /N0/RP0^M
Mar 22 12:00:13 test2                         FRU-PN: 5405489; FRU-SN: 161003; FRU-LOC: /N0/RP2^M
Mar 22 12:00:13 test2      Recommended-Action: Service action required^M
Mar 22 12:00:13 test2 ^M
Mar 22 12:00:13 test2 [AD] Event: E2900.ASIC.AR.ADR_PERR.10473001^M
Mar 22 12:00:13 test2      CSN:  DomainID: A ADInfo: 1.SCAPP.20.14^M
Mar 22 12:00:13 test2      Time: Sun Mar 21 20:50:25 PDT 2010^M
Mar 22 12:00:13 test2      FRU-List-Count: 1; FRU-PN: 5406679; FRU-SN: 005388; FRU-LOC: /N0/SB0^M
Mar 22 12:00:13 test2      Recommended-Action: Service action required^M
Mar 22 12:00:13 test2 ^M
Mar 22 12:00:13 test2 last message repeated 1 time
Mar 22 12:00:14 test2 lw8: [ID 899324 kern.crit]   3/21/10 8:40:09 PM A fatal condition is detected on Domain A. Initiating automatic restoration for this domain.^M
Mar 22 12:00:14 test2 last message repeated 1 time
Mar 22 12:00:14 test2 lw8: [ID 835203 kern.error]   3/21/10 8:40:11 PM Device voltage problem: /N0/SB0 abnormal state for device: Board 0 3.3 VDC 0 Value: 0.47 Volts DC^M

#2 | Posted 2010-03-29 12:52
Mar 22 12:00:14 test2 lw8: [ID 642942 kern.notice]   3/21/10 8:40:12 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: (SdcAsic)Asic.getTemp: Path broken between CBH and SDC: SB0.sdc.10 (12000010)^M
Mar 22 12:00:15 test2 lw8: [ID 957336 kern.notice]   3/21/10 8:40:12 PM Device will not be polled^M
Mar 22 12:00:15 test2 genunix: [ID 408822 kern.info] NOTICE: ce0: no fault external to device; service available
Mar 22 12:00:15 test2 genunix: [ID 611667 kern.info] NOTICE: ce0: xcvr addr:0x01 - link up 1000 Mbps full duplex
Mar 22 12:00:15 test2 lw8: [ID 211460 kern.notice]   3/21/10 8:40:12 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: (ArAsic)Asic.getTemp: Path broken between CBH and SDC: SB0.ar.10 (12080010)^M
Mar 22 12:00:15 test2 lw8: [ID 957336 kern.notice]   3/21/10 8:40:12 PM Device will not be polled^M
Mar 22 12:00:15 test2 lw8: [ID 712828 kern.notice]   3/21/10 8:40:12 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: /SB0/dx0: DxAsic.getTemp: sun.serengeti.jtag.JtagException: JtagController.tapWait:  Path broken between CBH and SDC: SB0.sdc.b0 (120000b0)^M
Mar 22 12:00:16 test2 lw8: [ID 957336 kern.notice]   3/21/10 8:40:12 PM Device will not be polled^M
Mar 22 12:00:16 test2 lw8: [ID 843900 kern.notice]   3/21/10 8:40:12 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: /SB0/dx1: DxAsic.getTemp: sun.serengeti.jtag.JtagException: JtagController.tapWait:  Path broken between CBH and SDC: SB0.sdc.b0 (120000b0)^M
Mar 22 12:00:16 test2 lw8: [ID 957336 kern.notice]   3/21/10 8:40:12 PM Device will not be polled^M
Mar 22 12:00:16 test2 genunix: [ID 408822 kern.info] NOTICE: ce1: no fault external to device; service available
Mar 22 12:00:16 test2 genunix: [ID 611667 kern.info] NOTICE: ce1: xcvr addr:0x01 - link up 1000 Mbps full duplex
Mar 22 12:00:16 test2 lw8: [ID 974972 kern.notice]   3/21/10 8:40:12 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: /SB0/dx2: DxAsic.getTemp: sun.serengeti.jtag.JtagException: JtagController.tapWait:  Path broken between CBH and SDC: SB0.sdc.b0 (120000b0)^M
Mar 22 12:00:16 test2 lw8: [ID 957336 kern.notice]   3/21/10 8:40:12 PM Device will not be polled^M
Mar 22 12:00:17 test2 lw8: [ID 354658 kern.notice]   3/21/10 8:40:13 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: /SB0/dx3: DxAsic.getTemp: sun.serengeti.jtag.JtagException: JtagController.tapWait:  Path broken between CBH and SDC: SB0.sdc.b0 (120000b0)^M
Mar 22 12:00:17 test2 lw8: [ID 990104 kern.notice]   3/21/10 8:40:13 PM Device will not be polled^M
Mar 22 12:00:17 test2 lw8: [ID 616019 kern.notice]   3/21/10 8:40:13 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: (RepeaterSbbcAsic)Asic.getTemp: Path broken between CBH and SDC: SB0.sbbc0.regs.10 (10000010)^M
Mar 22 12:00:17 test2 lw8: [ID 990104 kern.notice]   3/21/10 8:40:13 PM Device will not be polled^M
Mar 22 12:00:18 test2 lw8: [ID 295763 kern.notice]   3/21/10 8:40:13 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc0.regs.c0 (100000c0)^M
Mar 22 12:00:18 test2 lw8: [ID 990104 kern.notice]   3/21/10 8:40:13 PM Device will not be polled^M
Mar 22 12:00:18 test2 lw8: [ID 295763 kern.notice]   3/21/10 8:40:13 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc0.regs.c0 (100000c0)^M
Mar 22 12:00:18 test2 genunix: [ID 408822 kern.info] NOTICE: ce3: no fault external to device; service available
Mar 22 12:00:18 test2 genunix: [ID 611667 kern.info] NOTICE: ce3: xcvr addr:0x01 - link up 1000 Mbps full duplex
Mar 22 12:00:18 test2 lw8: [ID 990104 kern.notice]   3/21/10 8:40:13 PM Device will not be polled^M
Mar 22 12:00:19 test2 lw8: [ID 295763 kern.notice]   3/21/10 8:40:13 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc0.regs.c0 (100000c0)^M
Mar 22 12:00:19 test2 lw8: [ID 122891 kern.notice]   3/21/10 8:40:14 PM Device will not be polled^M
Mar 22 12:00:19 test2 lw8: [ID 971013 kern.notice]   3/21/10 8:40:14 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.HpuFailedException: CpuVoltageA2D.getOutputVoltage: sun.serengeti.CommException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc0.regs.c0 (100000c0)^M
Mar 22 12:00:19 test2 lw8: [ID 155659 kern.notice]   3/21/10 8:40:15 PM Device will not be polled^M
Mar 22 12:00:20 test2 genunix: [ID 408822 kern.info] NOTICE: ce2: no fault external to device; service available
Mar 22 12:00:20 test2 genunix: [ID 611667 kern.info] NOTICE: ce2: xcvr addr:0x01 - link up 1000 Mbps full duplex
Mar 22 12:00:20 test2 lw8: [ID 295765 kern.notice]   3/21/10 8:40:15 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc0.regs.c0 (100000c0)^M
Mar 22 12:00:20 test2 lw8: [ID 155659 kern.notice]   3/21/10 8:40:15 PM Device will not be polled^M
Mar 22 12:00:20 test2 lw8: [ID 973061 kern.notice]   3/21/10 8:40:16 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.HpuFailedException: CpuVoltageA2D.getOutputVoltage: sun.serengeti.CommException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc0.regs.c0 (100000c0)^M
Mar 22 12:00:20 test2 lw8: [ID 188427 kern.notice]   3/21/10 8:40:16 PM Device will not be polled^M
Mar 22 12:00:20 test2 vxdmp: [ID 803759 kern.notice] NOTICE: VxVM vxdmp V-5-0-34 added disk array OTHER_DISKS, datype = OTHER_DISKS
Mar 22 12:00:20 test2 lw8: [ID 346877 kern.notice]   3/21/10 8:40:16 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: (RepeaterSbbcAsic)Asic.getTemp: Path broken between CBH and SDC: SB0.sbbc1.regs.10 (10200010)^M
Mar 22 12:00:21 test2 vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x50 belonging to the dmpnode 276/0x30
Mar 22 12:00:21 test2 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 276/0x30
Mar 22 12:00:21 test2 vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x48 belonging to the dmpnode 276/0x10
Mar 22 12:00:21 test2 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 276/0x10
Mar 22 12:00:21 test2 vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x10 belonging to the dmpnode 276/0x28
Mar 22 12:00:21 test2 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 276/0x28
Mar 22 12:00:21 test2 vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x40 belonging to the dmpnode 276/0x18
Mar 22 12:00:21 test2 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 276/0x18
Mar 22 12:00:21 test2 vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x58 belonging to the dmpnode 276/0x20
Mar 22 12:00:21 test2 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 276/0x20
Mar 22 12:00:21 test2 lw8: [ID 188427 kern.notice]   3/21/10 8:40:16 PM Device will not be polled^M
Mar 22 12:00:21 test2 lw8: [ID 295776 kern.notice]   3/21/10 8:40:16 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc1.regs.c0 (102000c0)^M
Mar 22 12:00:21 test2 lw8: [ID 188427 kern.notice]   3/21/10 8:40:16 PM Device will not be polled^M
Mar 22 12:00:22 test2 lw8: [ID 295776 kern.notice]   3/21/10 8:40:16 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc1.regs.c0 (102000c0)^M
Mar 22 12:00:22 test2 lw8: [ID 188427 kern.notice]   3/21/10 8:40:16 PM Device will not be polled^M
Mar 22 12:00:22 test2 lw8: [ID 295776 kern.notice]   3/21/10 8:40:16 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc1.regs.c0 (102000c0)^M
Mar 22 12:00:22 test2 lw8: [ID 188427 kern.notice]   3/21/10 8:40:16 PM Device will not be polled^M
Mar 22 12:00:22 test2 lw8: [ID 973071 kern.notice]   3/21/10 8:40:16 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.HpuFailedException: CpuVoltageA2D.getOutputVoltage: sun.serengeti.CommException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc1.regs.c0 (102000c0)^M
Mar 22 12:00:23 test2 lw8: [ID 188427 kern.notice]   3/21/10 8:40:16 PM Device will not be polled^M
Mar 22 12:00:23 test2 lw8: [ID 295778 kern.notice]   3/21/10 8:40:18 PM CPU Board V3 at /N0/SB0 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB0.sbbc1.regs.c0 (102000c0)^M

#3 | Posted 2010-03-29 12:53
Mar 22 12:01:53 test2 ID[vxclust]: [ID 930323 local0.error] sent invalidate to vxconfigd
Mar 22 12:01:53 test2 ID[vxclust]: [ID 452856 local0.error] ending step step4 time: 03/22 12:01:53.129:
Mar 22 12:01:55 test2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_framework_start> completed successfully for resource <rac-framework-rs>, resource group <rac-rg>, time used: 2% of timeout <600 seconds>
Mar 22 12:01:55 test2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_framework_monitor_start> for resource <rac-framework-rs>, resource group <rac-rg>, timeout <3600> seconds
Mar 22 12:01:55 test2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_udlm_start> for resource <rac-udlm-rs>, resource group <rac-rg>, timeout <600> seconds
Mar 22 12:01:55 test2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_cvm_start> for resource <rac-cvm-rs>, resource group <rac-rg>, timeout <600> seconds
Mar 22 12:01:55 test2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_framework_monitor_start> completed successfully for resource <rac-framework-rs>, resource group <rac-rg>, time used: 0% of timeout <3600 seconds>
Mar 22 12:01:56 test2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_udlm_start> completed successfully for resource <rac-udlm-rs>, resource group <rac-rg>, time used: 0% of timeout <600 seconds>
Mar 22 12:01:56 test2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_udlm_monitor_start> for resource <rac-udlm-rs>, resource group <rac-rg>, timeout <600> seconds
Mar 22 12:01:56 test2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_cvm_start> completed successfully for resource <rac-cvm-rs>, resource group <rac-rg>, time used: 0% of timeout <600 seconds>
Mar 22 12:01:56 test2 Cluster.RGM.rgmd: [ID 707948 daemon.notice] launching method <bin/rac_cvm_monitor_start> for resource <rac-cvm-rs>, resource group <rac-rg>, timeout <600> seconds
Mar 22 12:01:56 test2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_udlm_monitor_start> completed successfully for resource <rac-udlm-rs>, resource group <rac-rg>, time used: 0% of timeout <600 seconds>
Mar 22 12:01:56 test2 Cluster.RGM.rgmd: [ID 736390 daemon.notice] method <bin/rac_cvm_monitor_start> completed successfully for resource <rac-cvm-rs>, resource group <rac-rg>, time used: 0% of timeout <600 seconds>

#4 | Posted 2010-03-29 13:21
It is probably the system board, i.e. a hardware fault:

Mar 22 12:00:13 test2      CSN:  DomainID: A ADInfo: 1.SCAPP.20.14^M
Mar 22 12:00:13 test2      Time: Sun Mar 21 20:50:25 PDT 2010^M
Mar 22 12:00:13 test2      FRU-List-Count: 2; FRU-PN: 5411384; FRU-SN: 018370; FRU-LOC: /N0/RP0^M
Mar 22 12:00:13 test2                         FRU-PN: 5405489; FRU-SN: 161003; FRU-LOC: /N0/RP2^M
Mar 22 12:00:13 test2      Recommended-Action: Service action required^M
Mar 22 12:00:13 test2 ^M
Mar 22 12:00:13 test2 [AD] Event: E2900.ASIC.AR.ADR_PERR.10473001^M
Mar 22 12:00:13 test2      CSN:  DomainID: A ADInfo: 1.SCAPP.20.14^M
Mar 22 12:00:13 test2      Time: Sun Mar 21 20:50:25 PDT 2010^M
Mar 22 12:00:13 test2      FRU-List-Count: 1; FRU-PN: 5406679; FRU-SN: 005388; FRU-LOC: /N0/SB0^M
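For anyone who wants to pull the suspect FRUs out of a long messages file instead of reading it by eye, here is a minimal Python sketch. It assumes the `[AD] Event:` and `FRU-LOC:` line layout seen in this thread; the function name `frus_by_event` and the sample constant are made up for illustration, and other firmware versions may format these lines differently.

```python
import re

# Sample lines copied from the messages excerpt in this thread.
SAMPLE = """\
Mar 22 12:00:13 test2 lw8: [ID 784451 kern.error]   3/21/10 8:40:08 PM [AD] Event: E2900.ASIC.AR.QUNF.10473020^M
Mar 22 12:00:13 test2      FRU-List-Count: 2; FRU-PN: 5411384; FRU-SN: 018370; FRU-LOC: /N0/RP0^M
Mar 22 12:00:13 test2                         FRU-PN: 5405489; FRU-SN: 161003; FRU-LOC: /N0/RP2^M
Mar 22 12:00:13 test2 [AD] Event: E2900.ASIC.AR.ADR_PERR.10473001^M
Mar 22 12:00:13 test2      FRU-List-Count: 1; FRU-PN: 5406679; FRU-SN: 005388; FRU-LOC: /N0/SB0^M
"""

# The trailing "^M" is the literal two-character sequence shown in the paste.
EVENT_RE = re.compile(r"\[AD\] Event: (\S+?)(?:\^M)?$")
FRU_RE = re.compile(r"FRU-PN: (\d+); FRU-SN: (\d+); FRU-LOC: (\S+?)(?:\^M)?$")


def frus_by_event(text):
    """Map each [AD] event code to the FRU locations listed under it."""
    result = {}
    current = None
    for line in text.splitlines():
        line = line.rstrip()
        event = EVENT_RE.search(line)
        if event:
            current = event.group(1)
            result.setdefault(current, [])
            continue
        fru = FRU_RE.search(line)
        if fru and current is not None:
            location = fru.group(3)
            if location not in result[current]:
                result[current].append(location)
    return result


if __name__ == "__main__":
    for event, locations in frus_by_event(SAMPLE).items():
        print(event, "->", ", ".join(locations))
```

On the lines quoted above, the queue underflow event lists /N0/RP0 and /N0/RP2, and the address parity event lists /N0/SB0.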

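#5 | Posted 2010-03-29 13:23
How did you determine that the CPUs and memory are fine? The RP boards report errors, and so does the system board. But both RP boards failing at the same time is unlikely, so the system board is still the most likely culprit. It would be best to run a self-test at the mem2 level. If that still does not settle it and you have more than one system board, I suggest pulling SB0 and testing without it!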
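To read the ASIC register dumps in the first post, each `[hi:lo]` column can be checked against the raw register value. A small Python sketch follows, using only the values and field names printed in the log. The helper names `bits` and `decode` are hypothetical, and the log does not spell out what `FE` stands for, so it is kept as printed.

```python
def bits(value, hi, lo):
    """Return bits hi..lo (inclusive) of value, as in the '[hi:lo]' columns."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


# Register values and field positions exactly as printed in the first post.
DUMPS = {
    "SafariPortError0": (0x00008004, {"FE": (15, 15), "QUnfErr": (2, 2)}),
    "SafariPortError1": (0x00008001, {"FE": (15, 15), "AdrPErr": (0, 0)}),
    "L2CheckError": (
        0x00009E1E,
        {"FE": (15, 15), "CMDVSyncErr": (12, 9), "PreqSyncErr": (4, 1)},
    ),
}


def decode(dumps):
    """Decode every named field of every register into an integer."""
    return {
        name: {field: bits(value, hi, lo) for field, (hi, lo) in fields.items()}
        for name, (value, fields) in dumps.items()
    }


if __name__ == "__main__":
    for name, fields in decode(DUMPS).items():
        print(name, {field: hex(v) for field, v in fields.items()})
```

The decoded fields match what the System Controller printed: 0x8004 sets bits 15 and 2, 0x8001 sets bits 15 and 0, and 0x9e1e gives 0xf in both bits 12:9 and bits 4:1.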

#6 | Posted 2010-03-29 13:39
Last edited by hawker60 on 2010-03-29 13:45

Apart from the RP and SB problems, has anyone spotted anything else?
Take vxclust, for example: I find it odd that vxclust logs so many errors!!!

#7 | Posted 2010-03-30 10:37
If the I/O board has a problem, it is normal for VxVM to report errors as well.
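One way to see which subsystems are actually logging at warning level or above is to tally the messages by source tag and severity. This is a rough Python sketch that assumes the `host tag: [ID n facility.level]` layout seen in this thread; `count_by_source` is a made-up name. Keep in mind that the level is chosen by whoever sends the message, so an `error` label on its own does not prove a fault.

```python
import re
from collections import Counter

# Sample lines copied from the messages excerpts in this thread.
SAMPLE = """\
Mar 22 12:00:06 test2 genunix: [ID 370176 kern.warning] WARNING: forceload of misc/md_trans failed
Mar 22 12:00:06 test2 lw8: [ID 279603 kern.crit]   3/21/10 8:40:04 PM ErrorMonitor: Domain A has a SYSTEM ERROR^M
Mar 22 12:00:11 test2 lw8: [ID 732992 kern.error]   3/21/10 8:40:05 PM /N0/RP2 encountered the first error^M
Mar 22 12:00:21 test2 vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 276/0x30
Mar 22 12:01:53 test2 ID[vxclust]: [ID 930323 local0.error] sent invalidate to vxconfigd
Mar 22 12:01:53 test2 ID[vxclust]: [ID 452856 local0.error] ending step step4 time: 03/22 12:01:53.129:
"""

# Groups: 1 = source tag, 2 = facility, 3 = level.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+ (\S+): \[ID \d+ (\w+)\.(\w+)\]"
)


def count_by_source(text, levels=("warning", "error", "crit")):
    """Count messages per (source tag, level) for the given levels."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(3) in levels:
            counts[(match.group(1), match.group(3))] += 1
    return counts


if __name__ == "__main__":
    for (tag, level), n in sorted(count_by_source(SAMPLE).items()):
        print(f"{tag:15s} {level:8s} {n}")
```

Continuation lines without a `[ID ...]` field, such as the register dumps, are skipped by this pattern, so the tally only covers message headers.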

#8 | Posted 2010-03-31 15:09
Following along to learn.

#9 | Posted 2010-04-04 19:58
A capacitor on the SB0 board has gone bad; replace SB0.

#10 | Posted 2010-04-05 22:30
Rule things out one at a time; memory and CPU are the least likely causes.