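rp7420: multiple DIMMs in deconfig state
On an rp7420, 18 GB of memory on one cell board is in the deconfig state, but neither syslog nor the event log contains any memory errors. Does this mean all 18 GB of memory is bad and has to be replaced? Moderator, please advise. The PS output is as follows: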
HW status for Cell 0 : NO FAILURE DETECTED
Power status : on, no fault
Boot is not blocked
PDH memory is shared
Processor Compatibility : OK
RIO cable status : connected
RIO cable connection physical location : PCI Domain 0
Core cell is cell 1
Attention Led is off
PDHC status Leds :****
CPU Module Slot 0 1 2 3
Populated P P P P
Local 48V Good * * * *
Power Enabled * * * *
Power Good * * * *
(* - True, P - Processor, T - Terminator)
DIMMs populated:
0 . . . 4 . . . 8 . . .12 . . .
* * * * * * * * * * * *
1 1 1 1 1 1
VRM's 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
Present : * * * * * * * * * * * * * * *
Enabled : * * * * * * * * * * * * * * *
Pwr Good : * * * * * * * * * * * * * * *
Front Side Bus Freq. : 200 MHz
CPU Core Freq. : 1000 MHz
CPU Part Number : PA8900
System Boot Rom (SFW) firmware rev24.001
PDH controller (PDHC) firmware rev 3.030, built THU JUL 27 21:18:14 2006
MICE revision is 1.0
HW status for Cell 1 : NO FAILURE DETECTED
Power status : on, no fault
Boot is not blocked
PDH memory is shared
Processor Compatibility : OK
RIO cable status : connected
RIO cable connection physical location : PCI Domain 1
Core cell is cell 1
Attention Led is off
PDHC status Leds :***-
CPU Module Slot 0 1 2 3
Populated P P P P
Local 48V Good * * * *
Power Enabled * * * *
Power Good * * * *
(* - True, P - Processor, T - Terminator)
DIMMs populated:
0 . . . 4 . . . 8 . . .12 . . .
* * * * * * * * * * * *
1 1 1 1 1 1
VRM's 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
Present : * * * * * * * * * * * * * * *
Enabled : * * * * * * * * * * * * * * *
Pwr Good : * * * * * * * * * * * * * * *
Front Side Bus Freq. : 200 MHz
CPU Core Freq. : 1000 MHz
CPU Part Number : PA8900
System Boot Rom (SFW) firmware rev24.001
PDH controller (PDHC) firmware rev 3.030, built THU JUL 27 21:18:14 2006
MICE revision is 1.0
The cstm output on the system is as follows:
Memory Board Inventory
CAB/CELL: 0/0
DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 2048 2048
1 2048 2048
2 2048 2048
3 2048 2048
4 1024 1024
5 1024 1024
6 0 0
7 0 0
Cell Total (MB): 20480
-------------------------------------------------
CAB/CELL: 0/1
DIMM A DIMM B
Slot Size (MB) Size (MB)
---- --------- ---------
0 0 0
1 0 0
2 0 0
3 0 0
4 1024 1024
5 0 0
6 0 0
7 0 0
Cell Total (MB): 2048
-------------------------------------------------
System Total (MB): 22528
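As a cross-check of the inventory above, here is a minimal Python sketch that sums the DIMM sizes per cell. The slot rows are copied from the cstm output; the function name `cell_totals` and the `cabN,cellN` keys are only illustrative choices, not anything cstm itself provides.

```python
import re

# Slot rows copied from the cstm "Memory Board Inventory" above
# (slot number, DIMM A size in MB, DIMM B size in MB).
CSTM_INVENTORY = """\
CAB/CELL: 0/0
0 2048 2048
1 2048 2048
2 2048 2048
3 2048 2048
4 1024 1024
5 1024 1024
6 0 0
7 0 0
CAB/CELL: 0/1
0 0 0
1 0 0
2 0 0
3 0 0
4 1024 1024
5 0 0
6 0 0
7 0 0
"""

def cell_totals(text):
    """Return {cell: total MB} by summing DIMM A and DIMM B for every slot."""
    totals = {}
    cell = None
    for line in text.splitlines():
        header = re.match(r"CAB/CELL:\s*(\d+)/(\d+)", line)
        if header:
            cell = f"cab{header.group(1)},cell{header.group(2)}"
            totals[cell] = 0
            continue
        row = re.match(r"\s*(\d+)\s+(\d+)\s+(\d+)\s*$", line)
        if row and cell is not None:
            totals[cell] += int(row.group(2)) + int(row.group(3))
    return totals

totals = cell_totals(CSTM_INVENTORY)
for cell, mb in totals.items():
    print(f"{cell}: {mb} MB")
print(f"system: {sum(totals.values())} MB")
```

The sums agree with the "Cell Total" and "System Total" lines cstm printed: the OS currently sees only one 1024 MB DIMM pair (slot 4) on cell 1.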
The parstatus output on the system is as follows:
Note: No action specified. Default behavior is display all.
Complex Name : Complex 01
Complex Capacity
Compute Cabinet (2 cell capable) : 1
Active MP Location : cabinet 0
Original Product Name : server rp7420
Original Serial Number : SGH4723MJ2
Current Product Order Number : A7025A
OEM Manufacturer :
Complex Profile Revision : 1.0
The total number of partitions present : 1
                       Cabinet  I/O      Bulk Power Backplane
                       Blowers  Fans     Supplies   Power Boards
                       OK/      OK/      OK/        OK/
Cab                    Failed/  Failed/  Failed/    Failed/
Num Cabinet Type       N Status N Status N Status   N Status     MP
=== ================== ======== ======== ========== ============ ======
0   2 cell slot        4/0/N+   6/0/N+   2/0/N+     -            Active
Notes: N+ = There are one or more spare items (fans/power supplies).
N= The number of items meets but does not exceed the need.
N- = There are insufficient items to meet the need.
?= The adequacy of the cooling system/power supplies is unknown.
HO = Housekeeping only; The power is in a standby state.
NA = Not Applicable.
                        CPU      Memory                               Use
                        OK/      (GB)                         Core    On
Hardware   Actual       Deconf/  OK/                          Cell    Next Par
Location   Usage        Max      Deconf    Connected To       Capable Boot Num
========== ============ ======== ========= ================== ======= ==== ===
cab0,cell0 Active Base  8/0/8    20.0/0.0  cab0,bay0,chassis0 no      yes  0
cab0,cell1 Active Core  8/0/8    2.0/18.0  cab0,bay0,chassis1 yes     yes  0
Notes: * = Cell has no interleaved memory.
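To make the Memory column easier to read, here is a small Python sketch that pulls the OK/Deconf figures out of the two cell rows. The rows are taken from the parstatus output above; the regular expression and the name `memory_by_cell` are assumptions made for this illustration, not part of parstatus.

```python
import re

# The two cell rows from the parstatus output above.
PARSTATUS_ROWS = [
    "cab0,cell0 Active Base 8/0/8 20.0/0.0 cab0,bay0,chassis0 no yes 0",
    "cab0,cell1 Active Core 8/0/8 2.0/18.0 cab0,bay0,chassis1 yes yes 0",
]

# Cell location, then the first "x.y/x.y" pair, which is the
# Memory (GB) OK/Deconf column. The CPU column (8/0/8) has no decimal
# point, so it is skipped.
ROW = re.compile(r"(cab\d+,cell\d+).*?(\d+\.\d+)/\s*(\d+\.\d+)")

def memory_by_cell(rows):
    """Return {cell: (ok_gb, deconf_gb)} for each parstatus cell row."""
    result = {}
    for row in rows:
        m = ROW.search(row)
        if m:
            result[m.group(1)] = (float(m.group(2)), float(m.group(3)))
    return result

mem = memory_by_cell(PARSTATUS_ROWS)
for cell, (ok_gb, deconf_gb) in mem.items():
    installed = ok_gb + deconf_gb
    print(f"{cell}: {ok_gb:.1f} GB OK, {deconf_gb:.1f} GB deconfigured, "
          f"{installed:.1f} GB installed")
```

Read this way, both cells have 20 GB installed; cell 1 simply has 18 GB of it deconfigured, which matches the 2048 MB that cstm reports for that cell.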
                                 Core Connected  Par
Hardware Location   Usage        IO   To         Num
=================== ============ ==== ========== ===
cab0,bay0,chassis0  Active       -    cab0,cell0 0
cab0,bay0,chassis1  Active       yes  cab0,cell1 0
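Par              # of  # of I/O
Num Status       Cells Chassis  Core cell  Partition Name (first 30 chars)
=== ============ ===== ======== ========== ===============================
0   Active       2     2        cab0,cell1 Partition 0
Reply to 1# miranda616
Check the SEL in the MP for any memory errors, and check in cstm for memory SBE (single-bit error) or MBE (multi-bit error) records. If there are none, schedule a reboot to BCH and reconfigure those DIMMs to see whether the system recognizes them again. If it still does not, look at which errors are reported; it is possible that one bad DIMM caused the others to be deconfigured along with it.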
Reply to 2# lbseraph: How do I reboot to BCH and reconfigure those DIMMs? Is there any documentation for this?
miranda616 posted on 2013-12-12 20:22
Reply to 2# lbseraph: How do I reboot to BCH and reconfigure those DIMMs? Is there any documentation for this?
http://docstore.mik.ua/manuals/hp-ux/en/5991-1247B/ch07s09.html#aes-npar-274a
Reply to 4# lbseraph: Thank you, moderator.
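Reply to 2# lbseraph: Moderator, I checked today and the MP log contains several memory entries reporting single-bit errors or a full PDT (Page Deallocation Table). All of the DIMMs reporting errors are among those in the deconfig state. I rebooted to BCH, set every deconfigured DIMM back to config, and cleared the PDT. After the reboot the state was the same as before, and two more DIMMs had become deconfigured. Does this mean every DIMM currently in the deconfig state has to be replaced?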
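miranda616 posted on 2013-12-16 18:59
Reply to 2# lbseraph: Moderator, I checked today and the MP log contains several memory entries reporting single-bit errors or a full PDT; these errors ...
If the DIMMs are still deconfigured after you clear the PDT, they really are faulty and need to be replaced. You could at least start by replacing the few DIMMs that report the most SBEs; if that does not help, replace all the DIMMs that reported errors.
In this situation, I recommend replacing them.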
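The "replace the DIMMs with the most SBEs first" suggestion can be sketched in a few lines of Python. Everything in the example input is hypothetical: the thread does not show the actual MP log entries, so the DIMM labels and event counts below are made up purely to show the ranking, and `rank_dimms_by_sbe` is just an illustrative name.

```python
from collections import Counter

def rank_dimms_by_sbe(events):
    """Return (dimm, count) pairs, most single-bit errors first.

    `events` holds one DIMM label per logged SBE event.
    """
    return Counter(events).most_common()

# Hypothetical input: one label per SBE entry found in the MP log.
# These are NOT the real entries from this system.
events = ["cell1/0A", "cell1/2B", "cell1/0A", "cell1/3A", "cell1/0A", "cell1/2B"]

ranking = rank_dimms_by_sbe(events)
for dimm, count in ranking:
    print(f"{dimm}: {count} SBE event(s)")
```

The DIMMs at the top of the list would be the first candidates for replacement; if DIMMs are still deconfigured afterwards, work down the list.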