Question: I powered off the failed SB board, so why won't the system power on?

Posted on 2015-07-22 23:00
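Today a SUN F4800 hung, and users called to say they could not log in. In the machine room every indicator light was green and looked normal. The serial console showed: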
CpuSafariGroup.flushConsoleBuffer: Path broken between CBH and SDC: SB2.sbbc1.sram.0 (10b00000)
CpuSafariGroup.flushConsoleBuffer: Path broken between CBH and SDC: SB2.sbbc1.sram.0 (10b00000)
CpuSafariGroup.flushConsoleBuffer: Path broken between CBH and SDC: SB2.sbbc0.sram.0 (10900000)
CpuSafariGroup.flushConsoleBuffer: Path broken between CBH and SDC: SB2.sbbc0.sram.0 (10900000)
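These messages kept flooding the screen, and pressing Enter did nothing.
Pressing Ctrl-A gave the -> prompt:
**Out of memory, exiting**
Then I got into the SC: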
sdwg11:A> showboards
Slot     Pwr Component Type                 State      Status     Domain
----     --- --------------                 -----      ------     ------
/N0/SB0  On  CPU Board V2                   Active     Passed     A
/N0/SB2  Off CPU Board V2                   Active     Failed     A
/N0/SB4  On  CPU Board V2                   Active     Passed     A
/N0/IB6  On  PCI I/O Board                  Active     Passed     A
/N0/IB8  On  PCI I/O Board                  Active     Passed     A

At this point the messages suddenly changed to:
CpuSafariGroup.flushConsoleBuffer: No Board Power: SB2.sbbc1.sram.0 (10b00000)
CpuSafariGroup.flushConsoleBuffer: No Board Power: SB2.sbbc1.sram.0 (10b00000)
CpuSafariGroup.flushConsoleBuffer: No Board Power: SB2.sbbc0.sram.0 (10900000)
CpuSafariGroup.flushConsoleBuffer: No Board Power: SB2.sbbc0.sram.0 (10900000)


sdwg11:A> showdo
Domain    Solaris Nodename    Domain Status            Keyswitch      
--------  ------------------  -----------------------  -------------  
A         sdwg11              Paused due to an error   on     

diag-level = init
post-tolerate-ce = false
mpr-support-enable = true
verbosity-level = min
error-level = max
interleave-scope = within-board
interleave-mode = optimal
reboot-on-error = true
hang-policy = reset
log-reset-data = true
verbose-reset-data = true
reset-data-ftp-url =
max-panic-diag-limit = mem2
OBP.use-nvramrc? = true
OBP.auto-boot? = true
OBP.error-reset-recovery = <OBP default>

Loghost for Domain A:
Log Facility for Domain A: local0

SNMP Agent: disabled
Domain Description:  
Domain Contact:  

ACL for Domain A: SB0 SB2 SB4 IB6 IB8

PROC RTUs reserved for domain A: 0

sdwg11:A> showkey
keyswitch is: standby
sdwg11:A> showenv   
Domain: A

Slot    Device     Sensor       Value  Units     Age     Status
------- ---------- ------------ ------ --------- ------- ------
/N0/SB0 Board 0    1.5 VDC 0    1.52   Volts DC    5 sec OK
/N0/SB0 Board 0    3.3 VDC 0    3.33   Volts DC    5 sec OK
/N0/SB0 SDC 0      Temp. 0      64     Degrees C   5 sec OK
/N0/SB0 AR 0       Temp. 0      48     Degrees C   5 sec OK
/N0/SB0 DX 0       Temp. 0      60     Degrees C   5 sec OK
/N0/SB0 DX 1       Temp. 0      63     Degrees C   5 sec OK
/N0/SB0 DX 2       Temp. 0      61     Degrees C   5 sec OK
/N0/SB0 DX 3       Temp. 0      57     Degrees C   5 sec OK
/N0/SB0 SBBC 0     Temp. 0      57     Degrees C   5 sec OK
/N0/SB0 Board 1    Temp. 0      29     Degrees C   5 sec OK
/N0/SB0 Board 1    Temp. 1      29     Degrees C   5 sec OK
/N0/SB0 CPU 0      Temp. 0      47     Degrees C   5 sec OK
/N0/SB0 CPU 0      Core 0       1.64   Volts DC    5 sec OK
/N0/SB0 CPU 1      Temp. 0      47     Degrees C   5 sec OK
/N0/SB0 CPU 1      Core 1       1.66   Volts DC    5 sec OK
/N0/SB0 SBBC 1     Temp. 0      45     Degrees C   5 sec OK
/N0/SB0 Board 1    Temp. 2      28     Degrees C   5 sec OK
/N0/SB0 Board 1    Temp. 3      29     Degrees C   5 sec OK
/N0/SB0 CPU 2      Temp. 0      48     Degrees C   5 sec OK
/N0/SB0 CPU 2      Core 2       1.63   Volts DC    5 sec OK
/N0/SB0 CPU 3      Temp. 0      47     Degrees C   5 sec OK
/N0/SB0 CPU 3      Core 3       1.63   Volts DC    5 sec OK
/N0/SB4 Board 0    1.5 VDC 0    1.52   Volts DC    4 sec OK
/N0/SB4 Board 0    3.3 VDC 0    3.33   Volts DC    4 sec OK
/N0/SB4 SDC 0      Temp. 0      75     Degrees C   4 sec OK
/N0/SB4 AR 0       Temp. 0      53     Degrees C   4 sec OK
/N0/SB4 DX 0       Temp. 0      66     Degrees C   4 sec OK
/N0/SB4 DX 1       Temp. 0      70     Degrees C   4 sec OK
/N0/SB4 DX 2       Temp. 0      73     Degrees C   4 sec OK
/N0/SB4 DX 3       Temp. 0      65     Degrees C   4 sec OK
/N0/SB4 SBBC 0     Temp. 0      59     Degrees C   4 sec OK
/N0/SB4 Board 1    Temp. 0      27     Degrees C   4 sec OK
/N0/SB4 Board 1    Temp. 1      29     Degrees C   4 sec OK
/N0/SB4 CPU 0      Temp. 0      41     Degrees C   4 sec OK
/N0/SB4 CPU 0      Core 0       1.41   Volts DC    4 sec OK
/N0/SB4 CPU 1      Temp. 0      42     Degrees C   4 sec OK
/N0/SB4 CPU 1      Core 1       1.42   Volts DC    5 sec OK
/N0/SB4 SBBC 1     Temp. 0      48     Degrees C   5 sec OK
/N0/SB4 Board 1    Temp. 2      27     Degrees C   5 sec OK
/N0/SB4 Board 1    Temp. 3      28     Degrees C   5 sec OK
/N0/SB4 CPU 2      Temp. 0      41     Degrees C   5 sec OK
/N0/SB4 CPU 2      Core 2       1.40   Volts DC    5 sec OK
/N0/SB4 CPU 3      Temp. 0      41     Degrees C   5 sec OK
/N0/SB4 CPU 3      Core 3       1.40   Volts DC    5 sec OK
/N0/SB2 Board 0    1.5 VDC 0    1.50   Volts DC   13 min OK
/N0/SB2 Board 0    3.3 VDC 0    0.64   Volts DC   13 min *** ERROR LOW ***
/N0/SB2 SDC 0      Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 AR 0       Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 DX 0       Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 DX 1       Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 DX 2       Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 DX 3       Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 SBBC 0     Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 Board 1    Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 Board 1    Temp. 1      ????   Degrees C  2 days failed
/N0/SB2 CPU 0      Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 CPU 0      Core 0       ????   Volts DC   2 days failed
/N0/SB2 CPU 1      Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 CPU 1      Core 1       ????   Volts DC   2 days failed
/N0/SB2 SBBC 1     Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 Board 1    Temp. 2      ????   Degrees C  2 days failed
/N0/SB2 Board 1    Temp. 3      ????   Degrees C  2 days failed
/N0/SB2 CPU 2      Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 CPU 2      Core 2       ????   Volts DC   2 days failed
/N0/SB2 CPU 3      Temp. 0      ????   Degrees C  2 days failed
/N0/SB2 CPU 3      Core 3       ????   Volts DC   2 days failed
/N0/IB6 Board 0    1.5 VDC 0    1.50   Volts DC    6 sec OK
/N0/IB6 Board 0    3.3 VDC 0    3.33   Volts DC    6 sec OK
/N0/IB6 Board 0    5 VDC 0      4.95   Volts DC    7 sec OK
/N0/IB6 Board 0    Temp. 0      33     Degrees C   7 sec OK
/N0/IB6 Board 0    Temp. 1      34     Degrees C   7 sec OK
/N0/IB6 Board 0    12 VDC 0     12.03  Volts DC    7 sec OK
/N0/IB6 SDC 0      Temp. 0      63     Degrees C   7 sec OK
/N0/IB6 AR 0       Temp. 0      47     Degrees C   7 sec OK
/N0/IB6 DX 0       Temp. 0      55     Degrees C   7 sec OK
/N0/IB6 DX 1       Temp. 0      51     Degrees C   7 sec OK
/N0/IB6 SBBC 0     Temp. 0      51     Degrees C   7 sec OK
/N0/IB6 IOASIC 0   Temp. 0      56     Degrees C   7 sec OK
/N0/IB6 IOASIC 1   Temp. 1      45     Degrees C   7 sec OK
/N0/IB8 Board 0    1.5 VDC 0    1.51   Volts DC    6 sec OK
/N0/IB8 Board 0    3.3 VDC 0    3.35   Volts DC    6 sec OK
/N0/IB8 Board 0    5 VDC 0      4.95   Volts DC    6 sec OK
/N0/IB8 Board 0    Temp. 0      34     Degrees C   6 sec OK
/N0/IB8 Board 0    Temp. 1      36     Degrees C   6 sec OK
/N0/IB8 Board 0    12 VDC 0     12.11  Volts DC    6 sec OK
/N0/IB8 SDC 0      Temp. 0      67     Degrees C   7 sec OK
/N0/IB8 AR 0       Temp. 0      51     Degrees C   7 sec OK
/N0/IB8 DX 0       Temp. 0      57     Degrees C   7 sec OK
/N0/IB8 DX 1       Temp. 0      55     Degrees C   7 sec OK
/N0/IB8 SBBC 0     Temp. 0      56     Degrees C   7 sec OK
/N0/IB8 IOASIC 0   Temp. 0      63     Degrees C   7 sec OK
/N0/IB8 IOASIC 1   Temp. 1      52     Degrees C   7 sec OK


sdwg11:A>
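A long showenv listing like the one above is easier to read if only the rows whose status is not OK are kept. The following is a minimal Python sketch of that filter. The column layout in the regular expression is an assumption based only on the output pasted in this thread, and the names ROW, abnormal_sensors and slots_with_faults are made up for illustration; none of this is part of the SC firmware.

```python
import re

# Assumed row layout (taken from the output above, not from Sun documentation):
# slot, device name + index, sensor, value, units, age, status.
ROW = re.compile(
    r"^(?P<slot>/N\d+/\S+)\s+"
    r"(?P<device>\S+\s+\d+)\s+"
    r"(?P<sensor>.+?)\s+"
    r"(?P<value>[\d.]+|\?+)\s+"
    r"(?P<units>Volts DC|Degrees C)\s+"
    r"(?P<age>\d+ \w+)\s+"
    r"(?P<status>.+?)\s*$"
)


def abnormal_sensors(text):
    """Return (slot, device, sensor, value, status) for each row whose status is not OK."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m and m.group("status") != "OK":
            rows.append((m.group("slot"), m.group("device"), m.group("sensor"),
                         m.group("value"), m.group("status")))
    return rows


def slots_with_faults(text):
    """Return the sorted list of slots that have at least one non-OK sensor."""
    return sorted({row[0] for row in abnormal_sensors(text)})


# A few rows copied from the showenv output above.
SAMPLE = """\
/N0/SB0 Board 0    1.5 VDC 0    1.52   Volts DC    5 sec OK
/N0/SB4 SDC 0      Temp. 0      75     Degrees C   4 sec OK
/N0/SB2 Board 0    1.5 VDC 0    1.50   Volts DC   13 min OK
/N0/SB2 Board 0    3.3 VDC 0    0.64   Volts DC   13 min *** ERROR LOW ***
/N0/SB2 SDC 0      Temp. 0      ????   Degrees C  2 days failed
/N0/IB6 Board 0    12 VDC 0     12.03  Volts DC    7 sec OK
"""

for row in abnormal_sensors(SAMPLE):
    print(row)
```

On these sample rows only /N0/SB2 is reported, which matches what the full listing shows: the 3.3 V rail on SB2 is low and its other sensors cannot be read.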


Based on the above I concluded that the SB2 CPU board had failed,
so I decided to power off SB2 and remove it from the domain:
sdwg11:A> poweroff sb2
/N0/SB2: should not be powered off while the state is: Active
/N0/SB2: power off will change domain A keyswitch position to standby
/N0/SB2: Do you want to forcefully power off? [no] yes
/N0/SB2: powered off
sdwg11:A> showb  

Slot     Pwr Component Type                 State      Status     Domain
----     --- --------------                 -----      ------     ------
/N0/SB0  On  CPU Board V2                   Assigned   Passed     A
/N0/SB2  Off CPU Board V2                   Assigned   Not tested A
/N0/SB4  On  CPU Board V2                   Assigned   Passed     A
/N0/IB6  On  PCI I/O Board                  Assigned   Passed     A
/N0/IB8  On  PCI I/O Board                  Assigned   Passed     A
sdwg11:A> deleteboar SB2

Then I tried to power the domain on:

sdwg11:A> setkey on
Powering boards on ...
Testing CPU Boards ...
{/N0/SB0/P0} Running CPU POR and Set Clocks
{/N0/SB0/P1} Running CPU POR and Set Clocks
{/N0/SB0/P0} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB0/P1} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB0/P0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB0/P1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB0/P0} Use is subject to license terms.
{/N0/SB0/P1} Use is subject to license terms.
{/N0/SB0/P2} Running CPU POR and Set Clocks
{/N0/SB0/P3} Running CPU POR and Set Clocks
{/N0/SB0/P2} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB0/P3} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB0/P2} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB0/P3} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB0/P2} Use is subject to license terms.
{/N0/SB0/P3} Use is subject to license terms.
{/N0/SB4/P0} Running CPU POR and Set Clocks
{/N0/SB4/P1} Running CPU POR and Set Clocks
{/N0/SB4/P0} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P2} Running CPU POR and Set Clocks
{/N0/SB4/P1} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P3} Running CPU POR and Set Clocks
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB4/P1} Use is subject to license terms.
{/N0/SB4/P2} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P3} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P2} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P3} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P2} Use is subject to license terms.
{/N0/SB4/P3} Use is subject to license terms.
{/N0/SB0/P0} Running Basic CPU
{/N0/SB0/P1} Running Basic CPU
{/N0/SB0/P0} Subtest: Setting Fireplane Config Registers
{/N0/SB0/P1} Subtest: Setting Fireplane Config Registers
{/N0/SB0/P2} Running Basic CPU
{/N0/SB0/P0} Subtest: Display CPU Version, frequency
{/N0/SB0/P1} Subtest: Display CPU Version, frequency
{/N0/SB0/P3} Running Basic CPU
{/N0/SB0/P0} Version register = 003e0015.22000507
{/N0/SB0/P2} Subtest: Setting Fireplane Config Registers for aid 0x2
{/N0/SB0/P1} Version register = 003e0015.22000507
{/N0/SB0/P3} Subtest: Setting Fireplane Config Registers for aid 0x3
{/N0/SB0/P2} Subtest: Display CPU Version, frequency
{/N0/SB0/P3} Subtest: Display CPU Version, frequency
{/N0/SB0/P2} Version register = 003e0015.22000507
{/N0/SB0/P3} Version register = 003e0015.22000507
{/N0/SB0/P0} CPU features = 0000224f.004204ff
{/N0/SB4/P0} Running Basic CPU
{/N0/SB4/P2} Running Basic CPU
{/N0/SB4/P1} Running Basic CPU
{/N0/SB0/P2} CPU features = 0000224f.004204ff
{/N0/SB0/P3} CPU features = 0000224f.004204ff
{/N0/SB0/P1} CPU features = 0000224f.004204ff
{/N0/SB4/P3} Running Basic CPU
{/N0/SB4/P2} Subtest: Setting Fireplane Config Registers for aid 0x12
{/N0/SB4/P3} Subtest: Setting Fireplane Config Registers for aid 0x13
{/N0/SB4/P2} Subtest: Display CPU Version, frequency
{/N0/SB4/P0} Subtest: Setting Fireplane Config Registers for aid 0x10
{/N0/SB4/P3} Subtest: Display CPU Version, frequency
{/N0/SB4/P1} Subtest: Setting Fireplane Config Registers for aid 0x11
{/N0/SB0/P2} Ecache Control Register 00000000.07a34c00
{/N0/SB4/P0} Subtest: Display CPU Version, frequency
{/N0/SB0/P3} Ecache Control Register 00000000.07a34c00
{/N0/SB0/P2} Running Test Large Tag Arrays and Enable MMU
{/N0/SB0/P3} Running Test Large Tag Arrays and Enable MMU
{/N0/SB4/P1} Subtest: Display CPU Version, frequency
{/N0/SB4/P0} Version register = 003e0015.b0000507
{/N0/SB4/P1} Version register = 003e0015.b0000507
{/N0/SB0/P0} Running Test Large Tag Arrays and Enable MMU
{/N0/SB0/P1} Running Test Large Tag Arrays and Enable MMU
{/N0/SB0/P0} Ecache Control Register 00000000.07a34c00
{/N0/SB0/P2} Cpu/System ratio = 6, cpu actual frequency = 900
{/N0/SB0/P3} Cpu/System ratio = 6, cpu actual frequency = 900
{/N0/SB0/P2} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB0/P3} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P2} Version register = 003e0015.b0000507
{/N0/SB0/P1} Ecache Control Register 00000000.07a34c00
{/N0/SB0/P0} Cpu/System ratio = 6, cpu actual frequency = 900
{/N0/SB0/P1} Cpu/System ratio = 6, cpu actual frequency = 900
{/N0/SB4/P0} CPU features = 0000225f.005205ff
{/N0/SB0/P2} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P1} CPU features = 0000225f.005205ff
{/N0/SB0/P0} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB0/P3} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P3} Version register = 003e0015.b0000507
{/N0/SB0/P1} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P2} Running Test Large Tag Arrays and Enable MMU
{/N0/SB4/P0} Running Test Large Tag Arrays and Enable MMU
{/N0/SB0/P0} Running FPU Tests
{/N0/SB0/P2} Running FPU Tests
{/N0/SB0/P3} Running FPU Tests
{/N0/SB0/P2} Use is subject to license terms.
{/N0/SB4/P3} Running Test Large Tag Arrays and Enable MMU
{/N0/SB4/P2} CPU features = 0000225f.005205ff
{/N0/SB4/P3} CPU features = 0000225f.005205ff
{/N0/SB4/P1} Running Test Large Tag Arrays and Enable MMU
{/N0/SB4/P0} Ecache Control Register 00000000.07c55400
{/N0/SB0/P1} Running FPU Tests
{/N0/SB0/P0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB0/P1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB0/P3} Use is subject to license terms.
{/N0/SB0/P2} Subtest: I-Cache Initialization
{/N0/SB4/P1} Ecache Control Register 00000000.07c55400
{/N0/SB4/P0} Cpu/System ratio = 8, cpu actual frequency = 1200
{/N0/SB4/P1} Cpu/System ratio = 8, cpu actual frequency = 1200
{/N0/SB0/P0} Use is subject to license terms.
{/N0/SB0/P1} Use is subject to license terms.
{/N0/SB0/P0} Subtest: I-Cache Initialization
{/N0/SB0/P1} Subtest: I-Cache Initialization
{/N0/SB4/P2} Ecache Control Register 00000000.07c55400
{/N0/SB4/P3} Ecache Control Register 00000000.07c55400
{/N0/SB4/P2} Cpu/System ratio = 8, cpu actual frequency = 1200
{/N0/SB4/P3} Cpu/System ratio = 8, cpu actual frequency = 1200
{/N0/SB4/P0} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P1} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P0} Running FPU Tests
{/N0/SB4/P1} Running FPU Tests
{/N0/SB4/P0} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P1} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P0} Use is subject to license terms.
{/N0/SB0/P3} Subtest: I-Cache Initialization
{/N0/SB4/P1} Use is subject to license terms.
{/N0/SB0/P2} Subtest: D-Cache Initialization
{/N0/SB4/P0} Subtest: I-Cache Initialization
{/N0/SB4/P2} Running FPU Tests
{/N0/SB0/P3} Subtest: D-Cache Initialization
{/N0/SB4/P1} Subtest: I-Cache Initialization
{/N0/SB4/P3} Running FPU Tests
{/N0/SB4/P2} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P3} @(#) lpost         5.20.6  2007/05/23 08:54
{/N0/SB4/P2} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB4/P3} Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
{/N0/SB0/P2} Running Basic Ecache
{/N0/SB0/P0} Running Basic Ecache
{/N0/SB4/P2} Use is subject to license terms.
{/N0/SB4/P3} Use is subject to license terms.
{/N0/SB0/P3} Running Basic Ecache
{/N0/SB0/P1} Running Basic Ecache
{/N0/SB0/P2} Subtest: W-Cache Initialization
{/N0/SB0/P0} Subtest: D-Cache Initialization
{/N0/SB0/P3} Subtest: W-Cache Initialization
{/N0/SB0/P1} Subtest: D-Cache Initialization
{/N0/SB0/P2} Subtest: P-Cache Initialization
{/N0/SB0/P0} Subtest: W-Cache Initialization
{/N0/SB0/P3} Subtest: P-Cache Initialization
{/N0/SB0/P1} Subtest: W-Cache Initialization
{/N0/SB0/P2} Subtest: Branch Prediction Initialization
{/N0/SB0/P0} Subtest: P-Cache Initialization
{/N0/SB0/P3} Subtest: Branch Prediction Initialization
{/N0/SB0/P1} Subtest: P-Cache Initialization
{/N0/SB4/P2} Running Basic Ecache
{/N0/SB4/P0} Running Basic Ecache
{/N0/SB4/P3} Running Basic Ecache
{/N0/SB4/P1} Running Basic Ecache
{/N0/SB4/P2} Subtest: I-Cache Initialization
{/N0/SB4/P0} Subtest: D-Cache Initialization
{/N0/SB4/P3} Subtest: I-Cache Initialization
{/N0/SB4/P1} Subtest: D-Cache Initialization
{/N0/SB4/P2} Subtest: D-Cache Initialization
{/N0/SB4/P0} Subtest: W-Cache Initialization
{/N0/SB4/P3} Subtest: D-Cache Initialization
{/N0/SB4/P1} Subtest: W-Cache Initialization
{/N0/SB4/P2} Subtest: W-Cache Initialization
{/N0/SB4/P0} Subtest: P-Cache Initialization
{/N0/SB4/P3} Subtest: W-Cache Initialization
{/N0/SB4/P1} Subtest: P-Cache Initialization
{/N0/SB0/P0} Running Memory Registers Tests
{/N0/SB0/P2} Running Memory Registers Tests
{/N0/SB0/P1} Running Memory Registers Tests
{/N0/SB0/P3} Running Memory Registers Tests
{/N0/SB0/P0} Subtest: Branch Prediction Initialization
{/N0/SB0/P2} Subtest: E-Cache Global Variables Initialization
{/N0/SB0/P1} Subtest: Branch Prediction Initialization
{/N0/SB0/P3} Subtest: E-Cache Global Variables Initialization
{/N0/SB0/P0} Subtest: E-Cache Global Variables Initialization
{/N0/SB0/P2} Subtest: Fast Init. Verification Test
{/N0/SB4/P0} Running Memory Registers Tests
{/N0/SB4/P1} Running Memory Registers Tests
{/N0/SB4/P2} Running Memory Registers Tests
{/N0/SB4/P3} Running Memory Registers Tests
{/N0/SB0/P1} Subtest: E-Cache Global Variables Initialization
{/N0/SB0/P0} Subtest: Fast Init. Verification Test
{/N0/SB0/P1} Subtest: Fast Init. Verification Test
{/N0/SB0/P3} Subtest: Fast Init. Verification Test
{/N0/SB0/P2} Subtest: IMMU Initialization
{/N0/SB0/P3} Subtest: IMMU Initialization
{/N0/SB4/P2} Subtest: P-Cache Initialization
{/N0/SB4/P0} Subtest: Branch Prediction Initialization
{/N0/SB4/P3} Subtest: P-Cache Initialization
{/N0/SB0/P0} Running Memory Configuration Tests
{/N0/SB0/P2} Running Memory Configuration Tests
{/N0/SB0/P1} Running Memory Configuration Tests
{/N0/SB4/P1} Subtest: Branch Prediction Initialization
{/N0/SB0/P3} Running Memory Configuration Tests
{/N0/SB4/P0} Subtest: E-Cache Global Variables Initialization
{/N0/SB4/P2} Subtest: Branch Prediction Initialization
{/N0/SB4/P3} Subtest: Branch Prediction Initialization
{/N0/SB4/P2} Subtest: E-Cache Global Variables Initialization
{/N0/SB4/P1} Subtest: E-Cache Global Variables Initialization
{/N0/SB4/P0} Subtest: Fast Init. Verification Test
{/N0/SB0/P2} Subtest: DMMU Initialization
{/N0/SB0/P3} Subtest: DMMU Initialization
{/N0/SB0/P2} Subtest: Map LPOST to local space
{/N0/SB0/P3} Subtest: Map LPOST to local space
{/N0/SB0/P0} Subtest: IMMU Initialization
{/N0/SB0/P1} Subtest: IMMU Initialization
{/N0/SB0/P0} Subtest: DMMU Initialization
{/N0/SB4/P3} Subtest: E-Cache Global Variables Initialization
{/N0/SB4/P1} Subtest: Fast Init. Verification Test
{/N0/SB0/P2} Subtest: E-Cache Initialization of first 1K
{/N0/SB0/P3} Subtest: E-Cache Initialization of first 1K
{/N0/SB0/P1} Subtest: DMMU Initialization
{/N0/SB0/P0} Subtest: Map LPOST to local space
{/N0/SB0/P1} Subtest: Map LPOST to local space
{/N0/SB4/P2} Running Memory Configuration Tests
{/N0/SB4/P3} Running Memory Configuration Tests
{/N0/SB4/P2} Subtest: Fast Init. Verification Test
{/N0/SB4/P3} Subtest: Fast Init. Verification Test
{/N0/SB4/P2} Subtest: IMMU Initialization
{/N0/SB0/P0} Subtest: E-Cache Initialization of first 1K
{/N0/SB0/P2} Subtest: E-Cache Initialization
{/N0/SB4/P3} Subtest: IMMU Initialization
{/N0/SB0/P1} Subtest: E-Cache Initialization of first 1K
{/N0/SB0/P3} Subtest: E-Cache Initialization
{/N0/SB4/P0} Running Memory Configuration Tests
{/N0/SB4/P2} Subtest: DMMU Initialization
{/N0/SB4/P3} Subtest: DMMU Initialization
{/N0/SB0/P0} Subtest: E-Cache Initialization
{/N0/SB4/P1} Running Memory Configuration Tests
{/N0/SB0/P1} Subtest: E-Cache Initialization
{/N0/SB4/P2} Subtest: Map LPOST to local space
{/N0/SB4/P3} Subtest: Map LPOST to local space
{/N0/SB0/P2} Subtest: Disable Memory Controllers
{/N0/SB0/P3} Subtest: Disable Memory Controllers
{/N0/SB4/P0} Subtest: IMMU Initialization
{/N0/SB4/P1} Subtest: IMMU Initialization
{/N0/SB4/P0} Subtest: DMMU Initialization
{/N0/SB4/P1} Subtest: DMMU Initialization
{/N0/SB0/P0} Subtest: Disable Memory Controllers
{/N0/SB0/P1} Subtest: Disable Memory Controllers
{/N0/SB4/P2} Subtest: E-Cache Initialization of first 1K
{/N0/SB0/P2} Subtest: Memory Controller Configuration
{/N0/SB0/P3} Subtest: Memory Controller Configuration
{/N0/SB4/P0} Subtest: Map LPOST to local space
{/N0/SB4/P3} Subtest: E-Cache Initialization of first 1K
{/N0/SB0/P0} Subtest: Memory Controller Configuration
{/N0/SB4/P1} Subtest: Map LPOST to local space
{/N0/SB0/P1} Subtest: Memory Controller Configuration
{/N0/SB0/P2} Subtest: Memory DIMMs Init
{/N0/SB0/P3} Subtest: Memory DIMMs Init
{/N0/SB4/P0} Subtest: E-Cache Initialization of first 1K
{/N0/SB4/P2} Subtest: E-Cache Initialization
{/N0/SB4/P1} Subtest: E-Cache Initialization of first 1K
{/N0/SB4/P3} Subtest: E-Cache Initialization
{/N0/SB0/P0} Subtest: Memory DIMMs Init
{/N0/SB0/P2} Subtest: UP Memory Clear
{/N0/SB0/P1} Subtest: Memory DIMMs Init
{/N0/SB0/P3} Subtest: UP Memory Clear
{/N0/SB4/P0} Subtest: E-Cache Initialization
{/N0/SB4/P1} Subtest: E-Cache Initialization
{/N0/SB4/P2} Subtest: Disable Memory Controllers
{/N0/SB4/P3} Subtest: Disable Memory Controllers
{/N0/SB4/P0} Subtest: Disable Memory Controllers
{/N0/SB4/P1} Subtest: Disable Memory Controllers
{/N0/SB0/P0} Subtest: UP Memory Clear
{/N0/SB0/P1} Subtest: UP Memory Clear
{/N0/SB4/P2} Subtest: Memory Controller Configuration
{/N0/SB4/P3} Subtest: Memory Controller Configuration
{/N0/SB4/P0} Subtest: Memory Controller Configuration
{/N0/SB4/P1} Subtest: Memory Controller Configuration
{/N0/SB4/P2} Subtest: Memory DIMMs Init
{/N0/SB4/P3} Subtest: Memory DIMMs Init
{/N0/SB4/P0} Subtest: Memory DIMMs Init
{/N0/SB4/P1} Subtest: Memory DIMMs Init
{/N0/SB4/P0} Subtest: UP Memory Clear
{/N0/SB4/P1} Subtest: UP Memory Clear
{/N0/SB4/P2} Subtest: UP Memory Clear
{/N0/SB4/P3} Subtest: UP Memory Clear
{/N0/SB0/P2} Running Memory Tests
{/N0/SB0/P0} Running Memory Tests
{/N0/SB0/P3} Running Memory Tests
{/N0/SB0/P1} Running Memory Tests
{/N0/SB0/P2} Subtest: Enable Correctable Error Traps
{/N0/SB0/P0} Subtest: Enable Correctable Error Traps
{/N0/SB0/P3} Subtest: Enable Correctable Error Traps
{/N0/SB0/P1} Subtest: Enable Correctable Error Traps
{/N0/SB4/P0} Running Memory Tests
{/N0/SB4/P2} Running Memory Tests
{/N0/SB4/P1} Running Memory Tests
{/N0/SB4/P3} Running Memory Tests
{/N0/SB4/P0} Subtest: Enable Correctable Error Traps
{/N0/SB4/P2} Subtest: Enable Correctable Error Traps
{/N0/SB4/P1} Subtest: Enable Correctable Error Traps
{/N0/SB4/P3} Subtest: Enable Correctable Error Traps
{/N0/SB0/P0} Running Advanced CPU Tests
{/N0/SB0/P2} Running Advanced CPU Tests
{/N0/SB0/P1} Running Advanced CPU Tests
{/N0/SB0/P3} Running Advanced CPU Tests
{/N0/SB0/P0} Running CPU ECC Tests
{/N0/SB0/P1} Running CPU ECC Tests
{/N0/SB0/P2} Running CPU ECC Tests
{/N0/SB0/P3} Running CPU ECC Tests
{/N0/SB4/P0} Running Advanced CPU Tests
{/N0/SB4/P2} Running Advanced CPU Tests
{/N0/SB4/P1} Running Advanced CPU Tests
{/N0/SB4/P3} Running Advanced CPU Tests
{/N0/SB0/P0} Running System Level Tests
{/N0/SB0/P1} Running System Level Tests
{/N0/SB0/P0} Subtest: Invalidate Caches
{/N0/SB0/P2} Running System Level Tests
{/N0/SB0/P3} Running System Level Tests
{/N0/SB0/P1} Subtest: Invalidate Caches
{/N0/SB4/P0} Running CPU ECC Tests
{/N0/SB4/P2} Running CPU ECC Tests
{/N0/SB4/P3} Running CPU ECC Tests
{/N0/SB4/P1} Running CPU ECC Tests
{/N0/SB0/P2} Subtest: Invalidate Caches
{/N0/SB0/P3} Subtest: Invalidate Caches
{/N0/SB0/P2} Running Board Memory Interleave
{/N0/SB0/P0} Running Board Memory Interleave
{/N0/SB0/P3} Running Board Memory Interleave
{/N0/SB0/P1} Running Board Memory Interleave
{/N0/SB0/P2} Subtest: Board Memory Interleave Configuration
{/N0/SB0/P0} Subtest: Board Memory Interleave Configuration
{/N0/SB0/P3} Subtest: Board Memory Interleave Configuration
{/N0/SB0/P1} Subtest: Board Memory Interleave Configuration
{/N0/SB0/P0} Passed
{/N0/SB0/P1} Passed
{/N0/SB0/P2} Passed
{/N0/SB0/P3} Passed
{/N0/SB4/P0} Running System Level Tests
{/N0/SB4/P1} Running System Level Tests
{/N0/SB4/P0} Subtest: Invalidate Caches
{/N0/SB4/P1} Subtest: Invalidate Caches
{/N0/SB4/P2} Running System Level Tests
{/N0/SB4/P3} Running System Level Tests
{/N0/SB4/P2} Subtest: Invalidate Caches
{/N0/SB4/P3} Subtest: Invalidate Caches
{/N0/SB4/P2} Running Board Memory Interleave
{/N0/SB4/P3} Running Board Memory Interleave
{/N0/SB4/P2} Subtest: Board Memory Interleave Configuration
{/N0/SB4/P3} Subtest: Board Memory Interleave Configuration
{/N0/SB4/P0} Running Board Memory Interleave
{/N0/SB4/P1} Running Board Memory Interleave
{/N0/SB4/P0} Subtest: Board Memory Interleave Configuration
{/N0/SB4/P1} Subtest: Board Memory Interleave Configuration
{/N0/SB4/P0} Passed
{/N0/SB4/P1} Passed
{/N0/SB4/P2} Passed
{/N0/SB4/P3} Passed
Testing IO Boards ...
Jul 22 16:12:46 sdwg11 Domain-A.SC: Paused due to an error
setkeyswitch operation did not complete
keyswitch is: standby
sdwg11:A>
sdwg11:A> showb  

Slot     Pwr Component Type                 State      Status     Domain
----     --- --------------                 -----      ------     ------
/N0/SB0  On  CPU Board V2                   Assigned   Passed     A
/N0/SB4  On  CPU Board V2                   Assigned   Passed     A
/N0/IB6  On  PCI I/O Board                  Assigned   Not tested A
/N0/IB8  On  PCI I/O Board                  Assigned   Not tested A


At this point only the removal LED at the bottom of SB2 is lit; all its other LEDs are off.
sdwg11:A> showlog

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 285230 local0.error]
>>> SafariPortError2[0x220] : 0x00028002
            AccParSglErr [17:17] : 0x1
               ParSglErr [01:01] : 0x1 ParitySingle error
                      FE [15:15] : 0x1

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 394940 local0.error]
    SafariPortError3[0x230] : 0x00000002
               ParSglErr [01:01] : 0x1 ParitySingle error

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 640063 local0.error]
Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 194459 local0.error] [AD] Event: SF4800
     CSN: 0336HH279E DomainID: A ADInfo: 1.SCAPP.20.6
     Time: Wed Jul 22 16:04:14 GMT+08:00 2015
     FRU-List-Count: 0; FRU-PN:  ; FRU-SN:  ; FRU-LOC: UNRESOLVED
     Recommended-Action: Service action required

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 490325 local0.error] /N0/SB4 detected an error: error register 0x0003; Service action recommended
Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 175524 local0.error] ArAsic reported first error on /N0/SB4
Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 294103 local0.error]
/partition0/domain0/SB4/ar0:
>>> L2CheckError[0x6150] : 0x00009e1e
             CMDVSyncErr [12:09] : 0xf Ports [9:6] command valid mismatched against internal expected command valid
             PreqSyncErr [04:01] : 0xf Ports [9:6] prereq mismatched against internal expected prereq
                      FE [15:15] : 0x1

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 464379 local0.error]
/partition0/domain0/SB4/sdc0:
>>> SafariPortError0[0x200] : 0x00028002
            AccParSglErr [17:17] : 0x1
               ParSglErr [01:01] : 0x1 ParitySingle error
                      FE [15:15] : 0x1

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 153134 local0.error]
>>> SafariPortError1[0x210] : 0x00028002
            AccParSglErr [17:17] : 0x1
               ParSglErr [01:01] : 0x1 ParitySingle error
                      FE [15:15] : 0x1

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 285230 local0.error]
>>> SafariPortError2[0x220] : 0x00028002
            AccParSglErr [17:17] : 0x1
               ParSglErr [01:01] : 0x1 ParitySingle error
                      FE [15:15] : 0x1

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 417326 local0.error]
>>> SafariPortError3[0x230] : 0x00028002
            AccParSglErr [17:17] : 0x1
               ParSglErr [01:01] : 0x1 ParitySingle error
                      FE [15:15] : 0x1

Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 640063 local0.error]
Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 194459 local0.error] [AD] Event: SF4800
     CSN: 0336HH279E DomainID: A ADInfo: 1.SCAPP.20.6
     Time: Wed Jul 22 16:04:14 GMT+08:00 2015
     FRU-List-Count: 0; FRU-PN:  ; FRU-SN:  ; FRU-LOC: UNRESOLVED
     Recommended-Action: Service action required

Jul 22 16:05:37 sdwg11 Domain-A.SC: [ID 606348 local0.warning] Excluded unusable, unlicensed, failed or disabled board: /N0/SB0
Jul 22 16:05:37 sdwg11 Domain-A.SC: [ID 606352 local0.warning] Excluded unusable, unlicensed, failed or disabled board: /N0/SB4
Jul 22 16:05:42 sdwg11 Domain-A.SC: [ID 849357 local0.error] DX Interconnect test: System board SB2/dx0 Dx-AR  pause line connection to system board(s)  RP2 failed
Jul 22 16:05:42 sdwg11 Domain-A.SC: [ID 794051 local0.error] SB2/dx0 Bit in error Global_Oring_Out_B [8]  
Jul 22 16:05:42 sdwg11 Domain-A.SC: [ID 142848 local0.error] DX Interconnect test: System board SB2/dx0 Dx-AR  pause line connection to system board(s)  RP0 failed
Jul 22 16:05:42 sdwg11 Domain-A.SC: [ID 531907 local0.error] SB2/dx0 Bit in error Global_Oring_Out_B [6]  
Jul 22 16:05:51 sdwg11 Domain-A.SC: [ID 967368 local0.error] CPU Board V2 at /N0/SB2 has been removed from domain A due to a failure in interconnection test. Service action required.
Jul 22 16:06:00 sdwg11 Domain-A.SC: [ID 983141 local0.error] Paused due to an error
Jul 22 16:09:29 sdwg11 Domain-A.SC: [ID 136975 local0.notice] Domain Shell - A: setkey on: Initiating keyswitch: on, domain A.
Jul 22 16:12:46 sdwg11 Domain-A.SC: [ID 983141 local0.error] Paused due to an error
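To see at a glance which boards the showlog output blames, the entries can be tallied per board and severity. This is only a rough Python sketch: the entry format in the regular expression is assumed from the lines pasted above, board_mentions is a hypothetical helper name, and a count of mentions is a hint about where to look, not a diagnosis.

```python
import re
from collections import Counter

# Assumed entry format: "Mon DD HH:MM:SS host source: [ID n facility.severity] message"
ENTRY = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) (?P<src>\S+): "
    r"\[ID \d+ (?P<facility>\w+)\.(?P<severity>\w+)\]\s*(?P<msg>.*)$"
)
# Board names as they appear in this thread (SB0, SB2, SB4, IB6, IB8).
BOARD = re.compile(r"\b(?:SB|IB)\d+\b")


def board_mentions(log_text):
    """Count log entries per (board, severity); register-dump continuation lines are skipped."""
    counts = Counter()
    for line in log_text.splitlines():
        m = ENTRY.match(line)
        if not m:
            continue
        # A board named twice in one entry is counted once for that entry.
        for board in set(BOARD.findall(m.group("msg"))):
            counts[(board, m.group("severity"))] += 1
    return counts


# Entries copied from the showlog output above.
SAMPLE_LOG = """\
Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 490325 local0.error] /N0/SB4 detected an error: error register 0x0003; Service action recommended
Jul 22 16:04:14 sdwg11 Domain-A.SC: [ID 175524 local0.error] ArAsic reported first error on /N0/SB4
Jul 22 16:05:37 sdwg11 Domain-A.SC: [ID 606348 local0.warning] Excluded unusable, unlicensed, failed or disabled board: /N0/SB0
Jul 22 16:05:37 sdwg11 Domain-A.SC: [ID 606352 local0.warning] Excluded unusable, unlicensed, failed or disabled board: /N0/SB4
Jul 22 16:05:42 sdwg11 Domain-A.SC: [ID 849357 local0.error] DX Interconnect test: System board SB2/dx0 Dx-AR  pause line connection to system board(s)  RP2 failed
Jul 22 16:05:51 sdwg11 Domain-A.SC: [ID 967368 local0.error] CPU Board V2 at /N0/SB2 has been removed from domain A due to a failure in interconnection test. Service action required.
Jul 22 16:06:00 sdwg11 Domain-A.SC: [ID 983141 local0.error] Paused due to an error
"""

for (board, severity), n in sorted(board_mentions(SAMPLE_LOG).items()):
    print(board, severity, n)
```

Note that in the real log SB4 is the board where the ArAsic reported the first error, while SB2 is the one that failed the interconnect test, so both show up under error.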



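I tried pulling SB2 out and starting again, but that did not work either.
poweroff all followed by poweron did not work either.
Every attempt hangs as soon as it reaches the I/O board self-test: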
{/N0/SB4/P3} Passed
Testing IO Boards ...
Jul 22 16:20:28 sdwg11 Domain-A.SC: Paused due to an error
setkeyswitch operation did not complete
keyswitch is: standby
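When comparing several failed setkey attempts, it helps to pull out the last phase banner printed before "setkeyswitch operation did not complete". Below is a small Python sketch of that check. The banner pattern ("... ..." at the end of the line) is assumed from the console output in this post, and last_phase_before_failure is a hypothetical name.

```python
import re

# Phase banners as they appear in this thread, e.g. "Testing IO Boards ..."
PHASE = re.compile(r"^(Powering boards on|Testing .+?) \.\.\.$")


def last_phase_before_failure(console_text):
    """Return the last phase banner seen before the failure message, or None if no failure is found."""
    phase = None
    for line in console_text.splitlines():
        line = line.strip()
        m = PHASE.match(line)
        if m:
            phase = m.group(1)
        elif "setkeyswitch operation did not complete" in line:
            return phase
    return None


# Shortened copy of the console output above.
SAMPLE_CONSOLE = """\
Powering boards on ...
Testing CPU Boards ...
{/N0/SB4/P3} Passed
Testing IO Boards ...
Jul 22 16:20:28 sdwg11 Domain-A.SC: Paused due to an error
setkeyswitch operation did not complete
keyswitch is: standby
"""

print(last_phase_before_failure(SAMPLE_CONSOLE))
```

For the attempts shown in this thread the answer is the I/O board test each time, that is, the CPU boards SB0 and SB4 pass POST and the domain pauses right after the I/O board test starts.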


Later I reinserted the board:
Jul 22 16:53:19 sdwg11 Platform.SC: SB2, hotplug status, SB2, module inserted (9,17)
Then I ran poweron all:
sdwg11:SC> poweron all

The SB2 board still reports errors:
Jul 22 16:56:09 sdwg11 Platform.SC: Device voltage problem: SB2 abnormal state for device: Board 0 3.3 VDC 0 Value: 0.67 Volts DC
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: (SdcAsic)Asic.getTemp: Path broken between CBH and SDC: SB2.sdc.10 (12400010)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: (ArAsic)Asic.getTemp: Path broken between CBH and SDC: SB2.ar.10 (12480010)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: /SB2/dx0: DxAsic.getTemp: sun.serengeti.jtag.JtagException: JtagController.tapWait:  Path broken between CBH and SDC: SB2.sdc.b0 (124000b0)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: /SB2/dx1: DxAsic.getTemp: sun.serengeti.jtag.JtagException: JtagController.tapWait:  Path broken between CBH and SDC: SB2.sdc.b0 (124000b0)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: /SB2/dx2: DxAsic.getTemp: sun.serengeti.jtag.JtagException: JtagController.tapWait:  Path broken between CBH and SDC: SB2.sdc.b0 (124000b0)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: /SB2/dx3: DxAsic.getTemp: sun.serengeti.jtag.JtagException: JtagController.tapWait:  Path broken between CBH and SDC: SB2.sdc.b0 (124000b0)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: (RepeaterSbbcAsic)Asic.getTemp: Path broken between CBH and SDC: SB2.sbbc0.regs.10 (10800010)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)
Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.HpuFailedException: CpuVoltageA2D.getOutputVoltage: sun.serengeti.CommException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.HpuFailedException: CpuVoltageA2D.getOutputVoltage: sun.serengeti.CommException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: (RepeaterSbbcAsic)Asic.getTemp: Path broken between CBH and SDC: SB2.sbbc1.regs.10 (10a00010)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc1.regs.c0 (10a000c0)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc1.regs.c0 (10a000c0)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc1.regs.c0 (10a000c0)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.HpuFailedException: CpuVoltageA2D.getOutputVoltage: sun.serengeti.CommException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc1.regs.c0 (10a000c0)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:10 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc1.regs.c0 (10a000c0)
Jul 22 16:56:10 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:11 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: sun.serengeti.HpuFailedException: CpuVoltageA2D.getOutputVoltage: sun.serengeti.CommException: I2cComm.readCmd:  Path broken between CBH and SDC: SB2.sbbc1.regs.c0 (10a000c0)
Jul 22 16:56:11 sdwg11 Platform.SC: Device will not be polled
Jul 22 16:56:11 sdwg11 Platform.SC: SB2, sensor status, outside acceptable limits (7,1,0x204020d00070000)
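The poll errors above repeat the same few device paths many times. As a rough aid for reading such a capture, here is a minimal Python sketch that counts the "Path broken" messages per device path. The function name and the sample list are only illustrative, and the pattern assumes the message format is exactly as it appears in this log.

```python
import re
from collections import Counter

# Device path and hex address as they appear in the SC messages, e.g.
# "Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)".
PATH_BROKEN = re.compile(
    r"Path broken between CBH and SDC: (?P<path>\S+) \((?P<addr>[0-9a-f]+)\)"
)


def tally_broken_paths(lines):
    """Count 'Path broken' messages per (device path, address) pair."""
    counts = Counter()
    for line in lines:
        match = PATH_BROKEN.search(line)
        if match:
            counts[(match["path"], match["addr"])] += 1
    return counts


# A few lines copied from the log above (hypothetical sample for the sketch).
SAMPLE = [
    "Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: "
    "sun.serengeti.FailedHwException: (SdcAsic)Asic.getTemp: "
    "Path broken between CBH and SDC: SB2.sdc.10 (12400010)",
    "Jul 22 16:56:09 sdwg11 Platform.SC: Device will not be polled",
    "Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: "
    "sun.serengeti.FailedHwException: I2cComm.readCmd:  "
    "Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)",
    "Jul 22 16:56:09 sdwg11 Platform.SC: CPU Board V2 at SB2 Device poll caused: "
    "sun.serengeti.FailedHwException: I2cComm.readCmd:  "
    "Path broken between CBH and SDC: SB2.sbbc0.regs.c0 (108000c0)",
]

if __name__ == "__main__":
    for (path, addr), n in tally_broken_paths(SAMPLE).most_common():
        print(f"{n:3d}  {path} ({addr})")
```

Fed the full capture instead of the sample, this would show at a glance that every broken path sits on SB2.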


But this time:
sdwg11:A> setkey on
Powering boards on ..
and the system actually came up.




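I have a few questions I would appreciate help with:
1. What is the -> prompt that Ctrl-A drops into? How do I get from there to the SC?
2. Why can't the domain start once SB2 has been powered off?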

Solution  1006136.1 :   Sun Fire[TM] V1280/E2900/3800/4800/4810/6800/E4900/E6900 & Netra 1280/1290: How to add CPUs and System Boards to a running domain
Can I follow this document to replace the CPU board?





#2
Posted 2015-07-22 23:36
Last edited by sherwinzhang on 2015-07-23 17:57

One more question, about how to use showfru. The manual I looked at shows:
hostname cli> showfru fan 1 Sun_Part_No
hostname cli> showfru slot 8 Sun_Part_No
But the machine itself prints:
sdwg11:SC> showfru
showfru
Usage: showfru [-v] -r <record>
       showfru -h

The usage given in the Sun Fire Midrange Systems Platform Administration Manual is:
showfru -r manr
Displays the manufacturing records of FRUs installed in a Sun Fire midrange system.


sdwg11:SC> showfru -r ma
showfru -r ma

Component        Part #         Serial Date       Time               Vend
---------        ------         ------ ----       ----               ----
SSC0             501-5407-13-61 030152 04/17/2003 18:00:27/GMT+08:00 012c
ID0              501-4406-06-50 903193 02/05/2003 06:10:28/GMT+08:00 03c1
PS0              300-1460-04-50 W14810 11/12/2003 18:51:33/GMT+08:00 026d
PS1              300-1460-04-50 C22148 02/11/2004 12:14:39/GMT+08:00 031a
PS2              300-1460-04-50 W12124 09/05/2003 06:30:25/GMT+08:00 026d
FT0              540-4345-01-52 WK171S 07/09/2003 22:58:31/GMT+08:00 021c

Solved it myself. This shows the information for all of the FRUs.
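For anyone who wants to keep these records in an inventory, here is a minimal Python sketch that splits the table above into fields. It assumes every data row has exactly six whitespace-separated columns, as in this output. The FruRecord type and parse_showfru function are made-up names for illustration, not part of any Sun tool.

```python
from typing import List, NamedTuple


class FruRecord(NamedTuple):
    component: str
    part_no: str
    serial: str
    date: str
    time: str
    vendor: str


def parse_showfru(text: str) -> List[FruRecord]:
    """Parse the manufacturing-record table into FruRecord tuples."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # The header splits into 7 tokens ("Part #" is two), so it is skipped
        # by the length check; the dashed separator is skipped explicitly.
        if len(fields) == 6 and not fields[0].startswith("-"):
            records.append(FruRecord(*fields))
    return records


# Two rows copied from the output above (sample input for the sketch).
SAMPLE = """\
Component        Part #         Serial Date       Time               Vend
---------        ------         ------ ----       ----               ----
SSC0             501-5407-13-61 030152 04/17/2003 18:00:27/GMT+08:00 012c
ID0              501-4406-06-50 903193 02/05/2003 06:10:28/GMT+08:00 03c1
"""

if __name__ == "__main__":
    for rec in parse_showfru(SAMPLE):
        print(rec.component, rec.part_no, rec.serial)
```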

#3
Posted 2015-07-23 14:05
After SB2 has been powered off, you should use deleteboard to remove SB2 from the domain, and then run poweron all.

#4
Posted 2015-07-23 17:55
znnnz posted on 2015-07-23 14:05:
After SB2 has been powered off, you should use deleteboard to remove SB2 from the domain, and then run poweron all.


sdwg11:A> deleteboar SB2

I did remove SB2 from the domain at the time, but I did not power it on. Would that affect startup? And why?

#5
Posted 2015-07-23 21:02
Reply to #4 sherwinzhang


    After poweroff sb2 and deleteboard sb2, you should exit the domain, run poweroff all, then poweron all again, and finally setkeyswitch on.
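To make the order in the reply above easier to follow, here is a small Python sketch that prints the sequence as a console transcript. It does not talk to a real SC. The command names come from this thread; which shell each one is typed in is an assumption based on the prompts seen here ("A" for the domain shell, "SC" for the platform shell). The step for leaving the domain shell is omitted because the thread does not name the command for it.

```python
# (shell, command) pairs in the order given in the reply above.
# The shell for each command is an assumption, not confirmed by the thread:
# "A" is the domain A shell, "SC" is the platform shell.
RECOVERY_STEPS = [
    ("A", "poweroff sb2"),
    ("A", "deleteboard sb2"),
    ("SC", "poweroff all"),
    ("SC", "poweron all"),
    ("A", "setkeyswitch on"),
]


def render_transcript(hostname, steps):
    """Format each step the way the SC prompt looks in this thread."""
    return [f"{hostname}:{shell}> {command}" for shell, command in steps]


if __name__ == "__main__":
    for line in render_transcript("sdwg11", RECOVERY_STEPS):
        print(line)
```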

#6
Posted 2015-07-23 21:09
Reply to #4 sherwinzhang


    It would also be a good idea to check with showchs -b.

#7
Posted 2015-07-24 10:15
znnnz posted on 2015-07-23 21:02:
Reply to #4 sherwinzhang

After poweroff sb2 and deleteboard sb2, you should exit the domain, run poweroff all, then poweron all again, and finally setkeyswitch on.



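My idea at the time was to remove only sb2. Since the domain was paused, I wanted to resume it directly. In the end it had to be rebooted anyway.
After poweroff sb2: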
sdwg11:A>showdo
Domain    Solaris Nodename    Domain Status            Keyswitch      
--------  ------------------  -----------------------  -------------  
A         sdwg11              Paused due to an error   on

diag-level = init
post-tolerate-ce = false
mpr-support-enable = true
verbosity-level = min
error-level = max
interleave-scope = within-board
interleave-mode = optimal
reboot-on-error = true
hang-policy = reset
log-reset-data = true
verbose-reset-data = true
reset-data-ftp-url =
max-panic-diag-limit = mem2
OBP.use-nvramrc? = true
OBP.auto-boot? = true
OBP.error-reset-recovery = <OBP default>

Loghost for Domain A:
Log Facility for Domain A: local0

SNMP Agent: disabled
Domain Description:  
Domain Contact:  

ACL for Domain A: SB0 SB2 SB4 IB6 IB8

PROC RTUs reserved for domain A: 0

#8
Posted 2015-07-24 16:12
The Sun Fire 6800 system is designed to greatly improve reliability, serviceability,
and availability (RAS) over previous generations of systems. The Sun Fire system is
designed to be able to recover from any hardware failure. Some failure recovery will
not impact users (for example, a power supply failure) if the system is configured for
redundant power supplies. Some failure recovery (for example, a CPU failure) will
require a reboot, and will impact users, but a properly configured system will
always be able to recover from any hardware failure.

So when a CPU board goes down, the system is bound to go down with it.