E2900 reports errors after boot!
This post was last edited by yulemi on 2013-05-27 14:27
{/N0/SB0/P0/C0} Subtest: MTAG ECC errors test
{/N0/SB0/P2/C0} Subtest: MTAG ECC errors test
{/N0/SB0/P1/C0} Subtest: MTAG ECC errors test
{/N0/SB0/P3/C0} Subtest: MTAG ECC errors test
{/N0/SB0/P0/C1} Subtest: Fast ECC errors test
{/N0/SB0/P2/C1} Subtest: Fast ECC errors test
{/N0/SB0/P1/C1} Subtest: Fast ECC errors test
{/N0/SB0/P3/C1} Subtest: Fast ECC errors test
{/N0/SB0/P0/C0} Subtest: SYSTEM ECC errors test
{/N0/SB0/P1/C0} Subtest: SYSTEM ECC errors test
{/N0/SB0/P2/C0} Subtest: SYSTEM ECC errors test
{/N0/SB0/P3/C0} Subtest: SYSTEM ECC errors test
{/N0/SB0/P2/C1} Subtest: MTAG ECC errors test
{/N0/SB0/P3/C1} Subtest: MTAG ECC errors test
{/N0/SB0/P0/C1} Subtest: MTAG ECC errors test
{/N0/SB0/P1/C1} Subtest: MTAG ECC errors test
{/N0/SB0/P0/C0} Subtest: Ecache Tag ECC errors test
{/N0/SB0/P2/C0} Subtest: Ecache Tag ECC errors test
{/N0/SB0/P1/C0} Subtest: Ecache Tag ECC errors test
{/N0/SB0/P3/C0} Subtest: Ecache Tag ECC errors test
{/N0/SB0/P0/C1} Subtest: SYSTEM ECC errors test
{/N0/SB0/P2/C1} Subtest: SYSTEM ECC errors test
{/N0/SB0/P1/C1} Subtest: SYSTEM ECC errors test
{/N0/SB0/P3/C1} Subtest: SYSTEM ECC errors test
{/N0/SB0/P2/C0} Running System Level Tests
{/N0/SB0/P3/C0} Running System Level Tests
{/N0/SB0/P0/C0} Running System Level Tests
{/N0/SB0/P1/C0} Running System Level Tests
{/N0/SB0/P0/C1} Running System Level Tests
{/N0/SB0/P1/C1} Running System Level Tests
{/N0/SB0/P2/C1} Running System Level Tests
{/N0/SB0/P0/C0} Subtest: MP Memory Access Test
{/N0/SB0/P1/C0} Subtest: MP Memory Access Test
{/N0/SB0/P0/C1} Subtest: Ecache Tag ECC errors test
{/N0/SB0/P3/C1} Running System Level Tests
{/N0/SB0/P2/C0} Subtest: MP Memory Access Test
{/N0/SB0/P1/C1} Subtest: Ecache Tag ECC errors test
{/N0/SB0/P0/C1} Subtest: MP Memory Access Test
{/N0/SB0/P1/C1} Subtest: MP Memory Access Test
{/N0/SB0/P3/C0} Subtest: MP Memory Access Test
{/N0/SB0/P2/C1} Subtest: Ecache Tag ECC errors test
{/N0/SB0/P3/C1} Subtest: Ecache Tag ECC errors test
{/N0/SB0/P2/C1} Subtest: MP Memory Access Test
{/N0/SB0/P3/C1} Subtest: MP Memory Access Test
{/N0/SB0/P2/C0} Subtest: Invalidate Caches
{/N0/SB0/P3/C0} Subtest: Invalidate Caches
{/N0/SB0/P0/C0} Subtest: Invalidate Caches
{/N0/SB0/P1/C0} Subtest: Invalidate Caches
{/N0/SB0/P0/C1} Subtest: Invalidate Caches
{/N0/SB0/P1/C1} Subtest: Invalidate Caches
{/N0/SB0/P2/C1} Subtest: Invalidate Caches
{/N0/SB0/P3/C1} Subtest: Invalidate Caches
{/N0/SB0/P2/C0} Running Board Memory Interleave
{/N0/SB0/P0/C0} Running Board Memory Interleave
{/N0/SB0/P3/C0} Running Board Memory Interleave
{/N0/SB0/P1/C0} Running Board Memory Interleave
{/N0/SB0/P2/C0} Subtest: Board Memory Interleave Configuration
{/N0/SB0/P0/C0} Subtest: Board Memory Interleave Configuration
{/N0/SB0/P3/C0} Subtest: Board Memory Interleave Configuration
{/N0/SB0/P1/C0} Subtest: Board Memory Interleave Configuration
{/N0/SB0/P0/C0} Passed
{/N0/SB0/P1/C0} Passed
{/N0/SB0/P0/C1} Passed
{/N0/SB0/P1/C1} Passed
{/N0/SB0/P2/C0} Passed
{/N0/SB0/P3/C0} Passed
{/N0/SB0/P2/C1} Passed
{/N0/SB0/P3/C1} Passed
{/N0/SB0/P0} Passed
{/N0/SB0/P1} Passed
{/N0/SB0/P2} Passed
{/N0/SB0/P3} Passed
Testing IO Boards ...
Copying IO PROM to CPU DRAM
.....................................................
{/N0/SB0/P0/C0} Running PCI IO Controller Basic Tests
{/N0/SB0/P0/C0} Jumping to memory 00000000.00000020
{/N0/SB0/P0/C0} System PCI IO post code running from memory
{/N0/SB0/P0/C0} @(#) lpost
5/26/13 8:32:14 PM Attempt to power up /N0/SB0 failed: /N0/SB0 3.3V DC failed, observed: 0.49 volts
5/26/13 8:32:14 PM sun.serengeti.HpuFailedException: CPU Board V3 at /N0/SB0
5/26/13 8:34:31 PM Data Parity error polling failed. Board will no longer be polled: JtagController.tapIssueCmd:JtagController.tapWait:Path broken between CBH and SDC: SB2.sbbc0.regs.b0 (108000b0)
5/26/13 8:34:32 PM WARNING: Board hotplugged while powered up. This can cause serious system damage.
5/26/13 8:34:32 PM SB2, hotplug status, SB2, module removed (9,16)
5/26/13 8:34:45 PM ErrorMonitor: Domain A has a SYSTEM ERROR
5/26/13 8:34:45 PM ErrorMonitor: Domain A has a SYSTEM ERROR
5/26/13 8:34:45 PM /N0/RP0 encountered the first error
5/26/13 8:34:45 PM
/RP0/ar0:
>>> SafariPortError2 : 0x00008001
AdrPErr : 0x1 Address parity error
FE : 0x1
5/26/13 8:34:45 PM
5/26/13 8:34:45 PM
/RP0/dx0:
>>> Safari Port Error Status 3 : 0x00048000
AccErr : 0x1
FirstError : 0x1
5/26/13 8:34:45 PM
/RP0/dx1:
>>> Safari Port Error Status 2 : 0x20008000
AccErr14 : 0x1
FirstError : 0x1
5/26/13 8:34:45 PM Event: E2900.ASIC.AR.ADR_PERR.10473002
CSN:DomainID: A ADInfo: 1.SCAPP.20.9
Time: Sun May 26 20:59:00 PDT 2013
FRU-List-Count: 1; FRU-PN: 5405489; FRU-SN: 161158; FRU-LOC: /N0/RP0
Recommended-Action: Service action required
5/26/13 8:34:46 PM Event: E2900.ASIC.AR.ADR_PERR.10473002
CSN:DomainID: A ADInfo: 1.SCAPP.20.9
Time: Sun May 26 20:59:00 PDT 2013
FRU-List-Count: 1; FRU-PN: 5405489; FRU-SN: 161158; FRU-LOC: /N0/RP0
Recommended-Action: Service action required
5/26/13 8:34:46 PM A fatal condition is detected on Domain A. Initiating automatic restoration for this domain.
5/26/13 8:34:46 PM A fatal condition is detected on Domain A. Initiating automatic restoration for this domain.
5/26/13 8:34:53 PM PANIC: Fatal Software Error
5/26/13 8:34:53 PM java.lang.NullPointerException
5/26/13 8:34:53 PM at sun.serengeti.diag.EccDiagnostics.processDataPathEvent(Unknown Source)
5/26/13 8:34:54 PM at sun.serengeti.diag.EccDiagnostics.serdParity(Unknown Source)
5/26/13 8:34:54 PM at sun.serengeti.diag.EccDiagnostics.processParityErr(Unknown Source)
5/26/13 8:34:54 PM at sun.serengeti.diag.EccDiagnostics.diagnoseEcc(Unknown Source)
5/26/13 8:34:54 PM at sun.serengeti.diag.EccDiagScheduler$EccAnalyzerThread.run(Unknown Source)
5/26/13 8:34:54 PM A: CycleKeyswitch: Initiating keyswitch: off, domain A.
5/26/13 8:34:54 PM A: CycleKeyswitch: Initiating keyswitch: off, domain A.
5/26/13 8:34:56 PM Keyswitch.detachBoards: sun.serengeti.IllegalParameterException: DetachBoard.invoke: board=2 not found
5/26/13 8:35:07 PM Device voltage problem: /N0/PS0 abnormal state for device: 48 VDC 0 Volt. 0 Value: 0.0 Volts DC
5/26/13 8:35:07 PM /N0/PS0, sensor status, outside acceptable limits (7,1,0x607000b000a0000)
5/26/13 8:35:07 PM SysEvent 5 /N0/PS0, power state, Off (5,0)
5/26/13 8:35:08 PM SysEvent 5 /N0/PS2, power state, Off (5,0)
5/26/13 8:35:08 PM /N0/FT0, fan speed, Off (4,0)
configuring IPv4 interfaces:5/26/13 8:35:08 PM Device voltage problem: /N0/PS3 abnormal state for device: 48 VDC 0 Volt. 0 Value: 0.0 Volts DC
ce0 ce2.
5/26/13 8:35:09 PM /N0/PS3, sensor status, outside acceptable limits (7,1,0x607030b000a0000)
Hostname: wg29002
5/26/13 8:35:09 PM CAUTION: Physically removing the last power supply will cause the system to lose power.
5/26/13 8:35:09 PM SysEvent 5 /N0/PS3, power state, Off (5,0)
5/26/13 8:35:12 PM SysEvent 5 /N0/PS0, power state, On (5,1)
5/26/13 8:35:14 PM Device voltage stabilized: /N0/PS0 normal operating state: 48 VDC 0 Volt. 0 Value: 48.0 Volts DC
5/26/13 8:35:20 PM Attempt to power up /N0/PS1 failed: Lw8PsHpu.setPower: sun.serengeti.FailedHwException: 48V power did not turn on
5/26/13 8:35:20 PM /N0/PS0, sensor status, within acceptable limits (7,2,0x607000b000a0000)
5/26/13 8:35:21 PM SysEvent 5 /N0/PS2, power state, On (5,1)
5/26/13 8:35:22 PM SysEvent 5 /N0/PS3, power state, On (5,1)
5/26/13 8:35:37 PM A: CycleKeyswitch: Initiating keyswitch: on, domain A.
5/26/13 8:35:37 PM A: CycleKeyswitch: Initiating keyswitch: on, domain A.
5/26/13 8:35:37 PM /N0/FT0, fan speed, Low (4,1)
5/26/13 8:35:40 PM Device voltage stabilized: /N0/PS3 normal operating state: 48 VDC 0 Volt. 0 Value: 48.0 Volts DC
5/26/13 8:35:41 PM /N0/PS3, sensor status, within acceptable limits (7,2,0x607030b000a0000)
5/26/13 8:36:00 PM Attempt to power up /N0/PS1 failed: Lw8PsHpu.setPower: sun.serengeti.FailedHwException: 48V power did not turn on
5/26/13 8:48:25 PM PS1, hotplug status, PS1, module removed (9,16)
5/26/13 8:48:35 PM /N0/PS1: Status is Failed
5/26/13 8:48:43 PM Attempt to power up /N0/PS1 failed: Lw8PsHpu.setPower: sun.serengeti.FailedHwException: 48V power did not turn on
5/26/13 8:48:44 PM /N0/PS1, hotplug status, PS1, module inserted (9,17)
ID configuration failed for line (/devices/ssm@0,0/pci@18,600000/SUNW,qlc@1/fp@0,0:fc::100000e002236f90) in file: /etc/cfg/fp/fabric_WWN_map. I/O error
Could not open /dev/rdsk/c5t600A0B80001FEDEA000052974676301Dd0s2 to verify device id.
No such device or address
WARNING: Unexpected token 'w100000e002236f90' on line 203 of /kernel/drv/st.conf
WARNING: Unexpected token 'lun' on line 203 of /kernel/drv/st.conf
WARNING: Unexpected token '=' on line 203 of /kernel/drv/st.conf
WARNING: 'name' property already specified (the ';' may have been omitted on previous spec!) on line 218 of /kernel/drv/st.conf
WARNING: missing name attribute on line 218 of /kernel/drv/st.conf
Could not open /dev/rmt/3l to verify device id.
No such device or address
Could not open /dev/rmt/2l to verify device id.
No such device or address
Could not open /dev/rmt/1l to verify device id.
No such device or address
Booting as part of a cluster
NOTICE: CMM: Node hnwg1 (nodeid = 1) with votecount = 1 added.
NOTICE: CMM: Node wg29002 (nodeid = 2) with votecount = 1 added.
NOTICE: CMM: Quorum device 2 (/dev/did/rdsk/d14s2) added; votecount = 1, bitmask of nodes with configured paths = 0x3.
NOTICE: clcomm: Adapter ce3 constructed
NOTICE: clcomm: Path wg29002:ce3 - hnwg1:ce3 being constructed
NOTICE: clcomm: Adapter ce1 constructed
NOTICE: clcomm: Path wg29002:ce1 - hnwg1:ce1 being constructed
NOTICE: CMM: Node wg29002: attempting to join cluster.
NOTICE: clcomm: Path wg29002:ce1 - hnwg1:ce1 being initiated
NOTICE: clcomm: Path wg29002:ce3 - hnwg1:ce3 being initiated
NOTICE: CMM: Node hnwg1 (nodeid: 1, incarnation #: 1369190319) has become reachable.
NOTICE: clcomm: Path wg29002:ce1 - hnwg1:ce1 online
NOTICE: clcomm: Path wg29002:ce3 - hnwg1:ce3 online
NOTICE: CMM: Cluster has reached quorum.
NOTICE: CMM: Node hnwg1 (nodeid = 1) is up; new incarnation number = 1369190319.
NOTICE: CMM: Node wg29002 (nodeid = 2) is up; new incarnation number = 1369626922.
NOTICE: CMM: Cluster members: hnwg1 wg29002.
NOTICE: CMM: node reconfiguration #74 completed.
NOTICE: CMM: Node wg29002: joined cluster.
ip: joining multicasts failed (18) on clprivnet0 - will use link layer broadcasts for multicast
Could not open /dev/rdsk/c5t600A0B80001FEDEA000052974676301Dd0s2 to verify device id.
No such device or address
Could not open /dev/rmt/3l to verify device id.
No such device or address
Could not open /dev/rmt/2l to verify device id.
No such device or address
Could not open /dev/rmt/1l to verify device id.
No such device or address
The system is coming up.Please wait.
checking ufs filesystems
/dev/md/rdsk/d30: is logging.
Starting VERITAS Private Branch Exchange
Done.
Setting netmask of ce0 to 255.255.255.192
Setting netmask of ce0:1 to 255.255.255.192
Setting netmask of ce2 to 255.255.255.192
May 27 11:55:44 in.mpathd: failback: ioctl (failback): Resource temporarily unavailable
Setting netmask of ce3 to 255.255.255.128
May 27 11:55:44 in.mpathd: failback: ioctl (failback): Resource temporarily unavailable
Setting netmask of ce1 to 255.255.255.128
Setting netmask of clprivnet0 to 255.255.255.0
Setting default IPv4 interface for multicast: add net 224.0/4: gateway wg29002
syslog service starting.
obtaining access to all attached disks
syslogd: line 19: unknown priority name "* @10.154.32.179"
May 27 11:55:47 wg29002 VERITAS: No proxy found.
May 27 11:55:49 wg29002 last message repeated 1 time
May 27 11:55:49 wg29002 xntpd: configure: keyword "perfer" unknown, line ignored
May 27 11:55:51 wg29002 VERITAS: No proxy found.
May 27 11:55:53 wg29002 last message repeated 1 time
May 27 11:55:54 wg29002 sendmail: makeconnection: service "smtp" unknown
May 27 11:55:54 wg29002 sendmail: makeconnection: service "smtp" unknown
vxsmf version 1.2.2.23
VERITAS Software Corporation
Copyright 2005 VERITAS Software Corporation. All rights reserved.
May 27 11:55:55 wg29002 VERITAS: No proxy found.
NetBackup Database Server started.
NetBackup Notification Service started.
NetBackup Enterprise Media Manager started.
NetBackup Resource Broker started.
Media Manager daemons started.
NetBackup request daemon started.
NetBackup compatibility daemon started.
NetBackup Job Manager started.
NetBackup Policy Execution Manager started.
NetBackup Service Layer started.
NetBackup is not configured for clustering.
NetBackup Service Monitor started.
Starting Sun Java(TM) Web Console Version 2.2...
Startup failed: unable to become the user identity "noaccess".
This may be due to one of the following:
- No permissions to read the noaccess home directory.
- No permissions to read the root (/) home directory.
- Invalid login shell for the user noaccess.
What do the errors above indicate? For hardware status, showcom reports everything as passed except PS1.
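Reply to #5 yulemi
You shut down the OS but did not shut down the domain, so pulling SB2 directly still produces errors.
"/N0/SB0 failed: /N0/SB0 3.3V DC failed": SB0's power conversion has a problem, and SB0 itself is probably faulty as well.
Check the output of showboard, showcom, showchs -b and so on. Is there a power supply fault?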
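For anyone who wants to sort through the SC messages before running those commands, here is a minimal Python sketch that groups the timestamped failure lines from the pasted log by the component they name (SB0, SB2, PS1 and so on). The keyword list and the function name `failures_by_component` are my own assumptions for illustration, not anything the SC firmware defines, and the patterns only cover the line shapes visible in this thread.

```python
import re
from collections import defaultdict

# Timestamp prefix as it appears in the SC log, e.g. "5/26/13 8:32:14 PM ".
TIMESTAMP = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2} \d{1,2}:\d{2}:\d{2} [AP]M)\s+(.*)$")
# Component names seen in the log: SB0, PS1, RP0, FT0 (with or without /N0/).
COMPONENT = re.compile(r"\b((?:SB|PS|RP|FT)\d+)\b")
# Keywords picked for this sketch (an assumption, not an official list).
KEYWORDS = ("failed", "module removed", "system error", "abnormal state")


def failures_by_component(log_text):
    """Group timestamped failure messages by the first component they name."""
    grouped = defaultdict(list)
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue  # not an SC message (POST output, OS console, etc.)
        when, message = match.groups()
        lowered = message.lower()
        if not any(keyword in lowered for keyword in KEYWORDS):
            continue  # informational line, e.g. "power state, On"
        component = COMPONENT.search(message)
        name = component.group(1) if component else "unknown"
        grouped[name].append((when, message))
    return dict(grouped)
```

Feeding it the whole log from the first post should give one list per board or power supply, which makes it easier to see that SB0, SB2 and PS1 each have their own failure messages.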
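5/26/13 8:32:14 PM Attempt to power up /N0/SB0 failed: /N0/SB0 3.3V DC failed, observed: 0.49 volts

lbseraph wrote on 2013-05-27 18:30:
Is there a power supply fault?

Yes, one power supply has failed: PS1.

OP,
truly impressive.
My guess is that the system board was pulled while the system was still running; judging from the messages, the OP did something along those lines.
Also, the messages include an error about SB0 failing, as well as an error for sp1.
It would be best to post some of the output from the SC as well.

yejunlon wrote on 2013-05-27 23:38:
OP,
truly impressive.
My guess is that the system board was pulled while the system was still running; judging from the messages, the OP did something along those lines.

I pulled SB2 only after the OS had been shut down, but the machine was still powered on. After SB0 came up, showcom showed no problems with it; PS1 does have a problem.

At the very least, replace the failed hard disk first, then do a clean shutdown and restart to see whether any new errors appear.

lbseraph wrote on 2013-05-29 23:18:
At the very least, replace the failed hard disk first, then do a clean shutdown and restart to see whether any new errors appear.

Replace a hard disk? All the faulty parts have been replaced by now; only SB2 still has memory and CPUs flagged by CHS.

yulemi wrote on 2013-05-30 09:31:
Replace a hard disk? All the faulty parts have been replaced by now; only SB2 still has memory and CPUs flagged by CHS.

Sorry, that was a typo. I meant the failed power supply. :dizzy:

SB2, hotplug status, SB2, module removed: SB2 has a problem.
PM Attempt to power up /N0/SB0 failed: /N0/SB0 3.3V DC failed, observed: 0.49 volts: SB0's power conversion has a problem.
5/26/13 8:48:35 PM /N0/PS1: Status is Failed
The PS1 power supply has failed; replacing it may resolve the problems above.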
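Besides SB0, SB2 and PS1, the SC log in the first post also contains an auto-diagnosis event (E2900.ASIC.AR.ADR_PERR.10473002) whose FRU line points at /N0/RP0. A rough Python sketch for pulling those FRU fields out of the log follows, assuming the "Key: value; Key: value" layout shown there. The function names `parse_fru_line` and `fru_locations` are hypothetical helpers of mine, not part of any Sun tool.

```python
def parse_fru_line(line):
    """Split a 'Key: value; Key: value' FRU line into a dict of strings."""
    fields = {}
    for part in line.split(";"):
        key, sep, value = part.partition(":")
        if sep:  # skip fragments that have no "key: value" shape
            fields[key.strip()] = value.strip()
    return fields


def fru_locations(log_text):
    """Return the distinct FRU-LOC values named in the log, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        if not line.strip().startswith("FRU-List-Count:"):
            continue
        location = parse_fru_line(line).get("FRU-LOC")
        if location and location not in seen:
            seen.append(location)
    return seen
```

The event is printed twice in the log, so the helper de-duplicates by location. Whether RP0 really needs service is something the SC output (showboard, showchs -b) would have to confirm.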