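Sun E2900 system board problem
Good afternoon, everyone. The machine is a Sun E2900 running Solaris 8. It has three system board slots: SB4, SB2, and SB0. Boards are installed in SB4 and SB2, and SB2 is raising an alarm.
The output of prtdiag -v is as follows: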
System Configuration: Sun Microsystems  sun4u Sun Fire E2900
System clock frequency: 150 MHZ
Memory size: 28GB
======================================= CPUs =======================================
                    E$     CPU     CPU    Temperature      Fan
CPU      Freq       Size   Impl.   Mask   Die    Ambient   Speed  Unit
-------  ---------  -----  ------  -----  -----  --------  -----  ----
SB2/P0   1200 MHz   16MB   US-IV   2.3    80 C   36 C
SB2/P1   1200 MHz   16MB   US-IV   2.3    78 C   38 C
SB2/P2   1200 MHz   16MB   US-IV   2.3    78 C   34 C
SB4/P0   1200 MHz   16MB   US-IV   2.3    69 C   36 C
SB4/P1   1200 MHz   16MB   US-IV   2.3    82 C   41 C
SB4/P2   1200 MHz   16MB   US-IV   2.3    82 C   40 C
SB4/P3   1200 MHz   16MB   US-IV   2.3    75 C   39 C
=============================== Memory Configuration ===============================
Segment Table:
-----------------------------------------------------------------------
Base Address   Size   Interleave Factor   Contains
-----------------------------------------------------------------------
0x0 8GB 8 BankIDs 32,33,34,36,37,38,40,41
0x200000000 4GB 4 BankIDs 35,39,42,43
0x2000000000 16GB 16 BankIDs 64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79
Bank Table:
-----------------------------------------------------------
Physical Location
ID   ControllerID   GroupID   Size   Interleave Way
-----------------------------------------------------------
32 8 0 1GB 0
33 8 1 1GB 3
34 8 0 1GB 6
36 9 0 1GB 1
37 9 1 1GB 4
38 9 0 1GB 7
40 10 0 1GB 2
41 10 1 1GB 5
35 8 1 1GB 0
39 9 1 1GB 1
42 10 0 1GB 2
43 10 1 1GB 3
64 16 0 1GB 0
65 16 1 1GB 4
66 16 0 1GB 8
67 16 1 1GB 12
68 17 0 1GB 1
69 17 1 1GB 5
70 17 0 1GB 9
71 17 1 1GB 13
72 18 0 1GB 2
73 18 1 1GB 6
74 18 0 1GB 10
75 18 1 1GB 14
76 19 0 1GB 3
77 19 1 1GB 7
78 19 0 1GB 11
79 19 1 1GB 15
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID   Labels
--------------------------------------------------
8 0 SB2/P0/B0/D0,SB2/P0/B0/D1,SB2/P0/B0/D2,SB2/P0/B0/D3
8 1 SB2/P0/B1/D0,SB2/P0/B1/D1,SB2/P0/B1/D2,SB2/P0/B1/D3
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID   Labels
--------------------------------------------------
9 0 SB2/P1/B0/D0,SB2/P1/B0/D1,SB2/P1/B0/D2,SB2/P1/B0/D3
9 1 SB2/P1/B1/D0,SB2/P1/B1/D1,SB2/P1/B1/D2,SB2/P1/B1/D3
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID   Labels
--------------------------------------------------
10 0 SB2/P2/B0/D0,SB2/P2/B0/D1,SB2/P2/B0/D2,SB2/P2/B0/D3
10 1 SB2/P2/B1/D0,SB2/P2/B1/D1,SB2/P2/B1/D2,SB2/P2/B1/D3
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID   Labels
--------------------------------------------------
16 0 SB4/P0/B0/D0,SB4/P0/B0/D1,SB4/P0/B0/D2,SB4/P0/B0/D3
16 1 SB4/P0/B1/D0,SB4/P0/B1/D1,SB4/P0/B1/D2,SB4/P0/B1/D3
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID   Labels
--------------------------------------------------
17 0 SB4/P1/B0/D0,SB4/P1/B0/D1,SB4/P1/B0/D2,SB4/P1/B0/D3
17 1 SB4/P1/B1/D0,SB4/P1/B1/D1,SB4/P1/B1/D2,SB4/P1/B1/D3
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID   Labels
--------------------------------------------------
18 0 SB4/P2/B0/D0,SB4/P2/B0/D1,SB4/P2/B0/D2,SB4/P2/B0/D3
18 1 SB4/P2/B1/D0,SB4/P2/B1/D1,SB4/P2/B1/D2,SB4/P2/B1/D3
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID   Labels
--------------------------------------------------
19 0 SB4/P3/B0/D0,SB4/P3/B0/D1,SB4/P3/B0/D2,SB4/P3/B0/D3
19 1 SB4/P3/B1/D0,SB4/P3/B1/D1,SB4/P3/B1/D2,SB4/P3/B1/D3
=============================== Environmental Status ===============================
Fan Speeds:
-----------------------------------------
Location Sensor Status Speed
-----------------------------------------
FT0/FAN3 ft_fan3 self-regulating
FT0/FAN0 ft_fan0 self-regulating
FT0/FAN1 ft_fan1 self-regulating
FT0/FAN2 ft_fan2 self-regulating
FT0/FAN4 ft_fan4 self-regulating
FT0/FAN5 ft_fan5 self-regulating
FT0/FAN6 ft_fan6 self-regulating
FT0/FAN7 ft_fan7 self-regulating
IB6/FAN0 ft_fan0 okay 100%
IB6/FAN1 ft_fan1 okay 100%
--------------------------------------------------
======== FRU Status =========
-------------------------
Fru Operational Status:
-------------------------
Location Status
-------------------------
PS0 okay
PS1 okay
PS2 okay
PS3 okay
FT0 okay
FT0/FAN3 okay
FT0/FAN0 okay
FT0/FAN1 okay
FT0/FAN2 okay
FT0/FAN4 okay
FT0/FAN5 okay
FT0/FAN6 okay
FT0/FAN7 okay
RP0 okay
RP2 okay
SB2 ok
SB2/P0 online
SB2/P0/B0/D0 okay
SB2/P0/B0/D1 okay
SB2/P0/B0/D2 okay
SB2/P0/B0/D3 okay
SB2/P0/B1/D0 okay
SB2/P0/B1/D1 okay
SB2/P0/B1/D2 okay
SB2/P0/B1/D3 okay
SB2/P1 online
SB2/P1/B0/D0 okay
SB2/P1/B0/D1 okay
SB2/P1/B0/D2 okay
SB2/P1/B0/D3 okay
SB2/P1/B1/D0 okay
SB2/P1/B1/D1 okay
SB2/P1/B1/D2 okay
SB2/P1/B1/D3 okay
SB2/P2 online
SB2/P2/B0/D0 okay
SB2/P2/B0/D1 okay
SB2/P2/B0/D2 okay
SB2/P2/B0/D3 okay
SB2/P2/B1/D0 okay
SB2/P2/B1/D1 okay
SB2/P2/B1/D2 okay
SB2/P2/B1/D3 okay
SB2/P3 disabled
SB2/P3/B0/D0 disabled
SB2/P3/B0/D1 disabled
SB2/P3/B0/D2 disabled
SB2/P3/B0/D3 disabled
SB2/P3/B1/D0 disabled
SB2/P3/B1/D1 disabled
SB2/P3/B1/D2 disabled
SB2/P3/B1/D3 disabled
SB4 ok
SB4/P0 online
SB4/P0/B0/D0 okay
SB4/P0/B0/D1 okay
SB4/P0/B0/D2 okay
SB4/P0/B0/D3 okay
SB4/P0/B1/D0 okay
SB4/P0/B1/D1 okay
SB4/P0/B1/D2 okay
SB4/P0/B1/D3 okay
SB4/P1 online
SB4/P1/B0/D0 okay
SB4/P1/B0/D1 okay
SB4/P1/B0/D2 okay
SB4/P1/B0/D3 okay
SB4/P1/B1/D0 okay
SB4/P1/B1/D1 okay
SB4/P1/B1/D2 okay
SB4/P1/B1/D3 okay
SB4/P2 online
SB4/P2/B0/D0 okay
SB4/P2/B0/D1 okay
SB4/P2/B0/D2 okay
SB4/P2/B0/D3 okay
SB4/P2/B1/D0 okay
SB4/P2/B1/D1 okay
SB4/P2/B1/D2 okay
SB4/P2/B1/D3 okay
SB4/P3 online
SB4/P3/B0/D0 okay
SB4/P3/B0/D1 okay
SB4/P3/B0/D2 okay
SB4/P3/B0/D3 okay
SB4/P3/B1/D0 okay
SB4/P3/B1/D1 okay
SB4/P3/B1/D2 okay
SB4/P3/B1/D3 okay
IB6 ok
IB6/FAN0 okay
IB6/FAN1 okay
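In other words, the P3 CPU on SB2 and its 4 GB of memory are missing.
Later a new board arrived, giving me three system boards in total, so I started experimenting. I'll skip the details; the results are summarized below:
Every board, when installed in slot SB4, has all of its CPUs and memory recognized. Every board, when installed in slot SB2, behaves exactly as shown above: the SB2/P3 CPU and its memory are not found. Every board, when installed in slot SB0, gives the following: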
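To spot what is missing without reading the whole FRU Status listing by eye, the status rows can be filtered mechanically. The following is a minimal Python sketch, assuming the prtdiag -v output has been saved as text; the function name disabled_frus is hypothetical and is not part of prtdiag or Solaris.

```python
# Hypothetical helper (not part of prtdiag): list FRUs reported as "disabled".
def disabled_frus(prtdiag_text):
    """Return FRU locations whose status column reads 'disabled'."""
    found = []
    for line in prtdiag_text.splitlines():
        parts = line.split()
        # FRU status rows in the listing above have two columns:
        # a location (e.g. SB2/P3) and a status (okay, online, disabled).
        if len(parts) == 2 and parts[1] == "disabled":
            found.append(parts[0])
    return found


# A few rows copied from the FRU Status section above.
sample = """\
SB2/P2 online
SB2/P2/B1/D3 okay
SB2/P3 disabled
SB2/P3/B0/D0 disabled
SB4 ok
"""

if __name__ == "__main__":
    for location in disabled_frus(sample):
        print(location)
```

Run against the full listing above, this would be expected to report SB2/P3 and its eight DIMM locations.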
System Configuration: Sun Microsystems  sun4u Sun Fire E2900
System clock frequency: 150 MHZ
Memory size: 28GB
======================================= CPUs =======================================
                    E$     CPU     CPU    Temperature      Fan
CPU      Freq       Size   Impl.   Mask   Die    Ambient   Speed  Unit
-------  ---------  -----  ------  -----  -----  --------  -----  ----
SB0/P1   1200 MHz   16MB   US-IV   2.4    98 C   38 C
SB0/P2   1200 MHz   16MB   US-IV   2.4    95 C   38 C
SB0/P3   1200 MHz   16MB   US-IV   2.4    91 C   39 C
SB4/P0   1200 MHz   16MB   US-IV   2.3    74 C   40 C
SB4/P1   1200 MHz   16MB   US-IV   2.3    89 C   45 C
SB4/P2   1200 MHz   16MB   US-IV   2.3    89 C   44 C
SB4/P3   1200 MHz   16MB   US-IV   2.3    81 C   42 C
In this slot, it is the P0 CPU and its memory that are not found.
Here is the boot-time output that I think is the key to the problem:
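The swap test above amounts to asking whether the missing CPU follows the board or the slot. A rough Python sketch of that reasoning is given below, under stated assumptions: the board labels A, B, and C are hypothetical (the thread does not name the boards), and fault_follows is an illustrative function, not a Sun tool.

```python
from collections import defaultdict


def fault_follows(observations):
    """Classify swap-test results.

    observations: list of (board, slot, missing) tuples, where missing is the
    CPU position that was not found (e.g. "P3") or None if all were found.
    Returns "slot", "board", "no fault", or "inconclusive".
    """
    by_slot = defaultdict(set)
    by_board = defaultdict(set)
    for board, slot, missing in observations:
        by_slot[slot].add(missing)
        by_board[board].add(missing)

    if all(missing is None for _, _, missing in observations):
        return "no fault"
    # Same result in a slot no matter which board is installed there.
    slot_consistent = all(len(results) == 1 for results in by_slot.values())
    # Same result for a board no matter which slot it is installed in.
    board_consistent = all(len(results) == 1 for results in by_board.values())
    if slot_consistent and not board_consistent:
        return "slot"
    if board_consistent and not slot_consistent:
        return "board"
    return "inconclusive"


# Results as summarized above; board labels A/B/C are hypothetical.
results = [
    (board, slot, missing)
    for board in ("A", "B", "C")
    for slot, missing in (("SB4", None), ("SB2", "P3"), ("SB0", "P0"))
]

if __name__ == "__main__":
    print(fault_follows(results))
```

With the results reported in this post the symptom stays with the slot, which points away from a defect on any one board and toward something tied to the slot position, such as a stored per-slot setting.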
Powering boards on ...
Fri Jul 13 00:23:01 noname.example.com lom: /N0/FT0, fan speed, Low (4,1)
Fri Jul 13 00:23:41 noname.example.com lom: Agent {/N0/SB2/P3/C0} is disabled.
Fri Jul 13 00:23:42 noname.example.com lom: Agent {/N0/SB2/P3/C1} is disabled.
Fri Jul 13 00:23:42 noname.example.com lom: Port {/N0/SB2/P3} is disabled.
Testing CPU Boards ...
Loading the test table from board SB2 PROM 0 ...
Loading the test table from board SB4 PROM 0 ...
Fri Jul 13 00:25:27 noname.example.com lom: Agent {/N0/SB2/P3/C0} is disabled.
Fri Jul 13 00:25:27 noname.example.com lom: Agent {/N0/SB2/P3/C1} is disabled.
Fri Jul 13 00:25:27 noname.example.com lom: Port {/N0/SB2/P3} is disabled.
{/N0/SB2/P3/C0} @(#) lpost 5.20.5 2007/02/07 13:54
{/N0/SB2/P3/C0} Copyright 2007 Sun Microsystems, Inc. All rights reserved.
{/N0/SB2/P3/C0} Use is subject to license terms.
{/N0/SB2/P3/C0} Subtest: Setting Fireplane Config Registers for aid 0xb
{/N0/SB2/P3/C0} Subtest: Display CPU Version, frequency
{/N0/SB2/P3/C0} Version register = 003e0018.23000507
{/N0/SB2/P3/C0} CPU features = 0000225f.045205ff
{/N0/SB2/P3/C0} Ecache Control Register 00000000.01c55000
{/N0/SB2/P3/C0} Cpu/System ratio = 8, cpu actual frequency = 1200
{/N0/SB2/P3/C0} @(#) lpost 5.20.5 2007/02/07 13:54
{/N0/SB2/P3/C0} Copyright 2007 Sun Microsystems, Inc. All rights reserved.
{/N0/SB2/P3/C0} Use is subject to license terms.
{/N0/SB2/P0/C0} Passed
{/N0/SB2/P1/C0} Passed
{/N0/SB2/P0/C1} Passed
{/N0/SB2/P1/C1} Passed
{/N0/SB2/P2/C0} Passed
{/N0/SB2/P3/C0} Disabled
{/N0/SB2/P2/C1} Passed
{/N0/SB2/P3/C1} Disabled
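The lom messages in this log already name the components that were skipped. As a small sketch, assuming the console output has been captured to a string, the following Python pulls out each distinct "is disabled" entry; the function name and the regular expression are illustrative assumptions based only on the message format shown above.

```python
import re

# Matches console lines such as:
#   ... lom: Agent {/N0/SB2/P3/C0}is disabled.
# The optional whitespace after "}" tolerates both spacings.
DISABLED_RE = re.compile(r"lom: (Agent|Port) \{([^}]+)\}\s*is disabled\.")


def disabled_from_log(log_text):
    """Return distinct (kind, path) pairs reported as disabled, in first-seen order."""
    seen = []
    for kind, path in DISABLED_RE.findall(log_text):
        if (kind, path) not in seen:
            seen.append((kind, path))
    return seen


# Lines copied from the boot log above (each message appears twice there).
sample_log = """\
Fri Jul 13 00:23:41 noname.example.com lom: Agent {/N0/SB2/P3/C0}is disabled.
Fri Jul 13 00:23:42 noname.example.com lom: Agent {/N0/SB2/P3/C1}is disabled.
Fri Jul 13 00:23:42 noname.example.com lom: Port {/N0/SB2/P3}is disabled.
Testing CPU Boards ...
Fri Jul 13 00:25:27 noname.example.com lom: Agent {/N0/SB2/P3/C0}is disabled.
Fri Jul 13 00:25:27 noname.example.com lom: Port {/N0/SB2/P3}is disabled.
"""

if __name__ == "__main__":
    for kind, path in disabled_from_log(sample_log):
        print(kind, path)
```

Here both cores (C0 and C1) and the port itself are flagged before POST starts testing the boards, which suggests the disable is a stored state being applied and not a POST failure detected during this boot.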
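So, experts, that should make my problem clear. Is any other information needed? I would appreciate any suggestions on what to do. Many thanks.
It has probably been disabled. Take a look with showchs -b.
Buddy, where do you enter that command? I don't see it at the lom> prompt.
These commands are available, though: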
lom>showboards
Slot       Pwr  Component Type                 State     Status
----       ---  --------------                 -----     ------
SSC1       On   System Controller V2           Main      Passed
/N0/SCC    -    System Config Card             Assigned  OK
/N0/BP     -    Baseplane                      Assigned  Passed
/N0/SIB    -    Indicator Board                Assigned  Passed
/N0/SPDB   -    System Power Distribution Bd.  Assigned  Passed
/N0/PS0    Off  A166 Power Supply              -         OK
/N0/PS1    Off  A166 Power Supply              -         OK
/N0/PS2    Off  A166 Power Supply              -         OK
/N0/PS3    Off  A166 Power Supply              -         OK
/N0/FT0    Off  Fan Tray                       -         Not tested
RP0        Off  No Grid Power                  Assigned  -
RP2        Off  No Grid Power                  Assigned  -
SB2        Off  No Grid Power                  Assigned  -
SB4        Off  No Grid Power                  Assigned  -
IB6        Off  No Grid Power                  Assigned  -
/N0/MB     -    Media Bay                      Assigned  Not tested
lom>showfault
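fault is off
Reply to #3 frzzdj:
lom>showchs -b
The P3 CPU has been disabled, so its memory is not recognized either.
Approach: determine whether the disable was caused by a faulty CPU/system board or by faulty memory.
If the board is faulty, clearing the CHS will not help.
If the memory is faulty, clearing the CHS after replacing the memory will help.
Also, I sell these boards cheaply; contact me if you want one.
I sold one of these boards a few days ago; I wonder whether it ended up with you.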
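First try the enable command at the lom> prompt.
Based on my experiments, I don't think the board or the memory can be faulty, unless a fault on the first board could disable a CPU on the boards I installed afterwards!
Also, goldfishway, you sold a board a few days ago? Where did you ship it? No need to name the company, just the city; it may well have come to me!
One more question: are "disable" and "CHS disable" the same thing? I never saw anything about CHS on my system from start to finish.
Resolution: after init 0, I saw that diag_level was set to init (I had never seen that value; shouldn't it be max or min?).
ok power-off
At the lom> prompt, enable the disabled CPU
lom>poweron
After POST completed, the system booted, and the CPU and its memory were both found.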