Chinaunix

Title: Help: HP rx6600 reports an error after a firmware upgrade

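Author: tomorrowlm    Time: 2013-06-20 11:36
Title: Help: HP rx6600 reports an error after a firmware upgrade
After a firmware upgrade, the HP rx6600 raises an alarm and the alert LED is lit. The operating system still boots normally. The alert log follows: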

Log Entry 36: 19 Jun 2013 23:19:03
Alert Level 7: Fatal
Keyword: MC_INITIATED
Machine Check initiated
Logged by: System Firmware  6
Data: Major change in system state - HPMC or MCA
0xF480009806E00320 000000000000000B
  
  
Log Entry 35: 19 Jun 2013 23:19:03
Alert Level 7: Fatal
Keyword: MACHINE_CHECK_INITIATED
Machine Check initiated
Logged by: Redundant w/ an E0 code;
Sensor: Critical Interrupt
Data2: OEM Code2: 0x00
0xC151C23C67020310 003FA17000130300
Author: haizdl    Time: 2013-06-20 19:04
>> Keyword: MACHINE_CHECK_INITIATED

An MCA has occurred. Collect the MCA log and decode it with HP's tools.
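Author: lbseraph    Time: 2013-06-20 21:34
Reply to 1# tomorrowlm

An MCA is usually caused by a hardware problem. Capture the MCA log from the EFI shell, or check whether /var/tombstones contains a corresponding mca file. Once you have the logs, have HP analyze them.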
shell> errdump mca

PS: If the system has booted and no similar MCA has shown up since, there is most likely nothing seriously wrong. You can keep watching it.
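
For the /var/tombstones check, here is a minimal Python sketch that lists candidate MCA files, newest first. It assumes the dump files have "mca" somewhere in their name, which this thread does not confirm, and that Python is available on the host. The function name find_mca_files is made up for illustration.

```python
import os
import time


def find_mca_files(directory="/var/tombstones"):
    """Return (path, mtime) pairs for regular files whose name contains
    'mca', newest first. The naming pattern is an assumption."""
    try:
        names = os.listdir(directory)
    except OSError:
        return []  # directory missing or unreadable
    found = []
    for name in names:
        path = os.path.join(directory, name)
        if "mca" in name.lower() and os.path.isfile(path):
            found.append((path, os.path.getmtime(path)))
    # Sort on modification time so the most recent dump comes first.
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, mtime in find_mca_files():
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(stamp, path)
```

An empty result only means no matching file was found under that directory. It does not prove that no MCA occurred, so the EFI shell errdump is still worth running.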
   
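Author: NEDK    Time: 2013-06-26 21:17
Check whether the machine check is reported every time. It is not necessarily a problem, but if it is reported every time, it is definitely a hardware problem.
Author: lbseraph    Time: 2013-06-28 23:35
Right. Also, if the system boots normally, clear the alert LED and then keep observing. If a hardware problem really is the cause, the MCA error will show up again, and at that point you can capture the logs and the MCA dump for analysis.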
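
To make "does it recur?" easier to answer, here is a rough Python sketch that scans a saved copy of the event log and collects the distinct timestamps carrying a machine-check keyword. It assumes the log text has the same layout as the entries in the first post ("Log Entry N: <date>" followed by a "Keyword:" line). The function name machine_check_times is hypothetical.

```python
import re
from datetime import datetime

HEADER_RE = re.compile(r"^Log Entry\s+\d+:\s+(.+?)\s*$")
MC_KEYWORDS = {"MC_INITIATED", "MACHINE_CHECK_INITIATED"}


def machine_check_times(log_text):
    """Return sorted, de-duplicated timestamps of machine-check entries."""
    times = set()
    current = None  # timestamp of the entry being read
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            current = datetime.strptime(header.group(1), "%d %b %Y %H:%M:%S")
            continue
        if line.startswith("Keyword:") and current is not None:
            keyword = line.split(":", 1)[1].strip()
            if keyword in MC_KEYWORDS:
                times.add(current)
    return sorted(times)


SAMPLE = """\
Log Entry 36: 19 Jun 2013 23:19:03
Alert Level 7: Fatal
Keyword: MC_INITIATED
Log Entry 35: 19 Jun 2013 23:19:03
Alert Level 7: Fatal
Keyword: MACHINE_CHECK_INITIATED
"""

if __name__ == "__main__":
    for when in machine_check_times(SAMPLE):
        print("machine check at", when)
```

The two entries from the first post share one timestamp, so they count as a single machine check. Several distinct timestamps after the LED has been cleared would point toward the recurring hardware problem described above.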
Author: cjhvslhb    Time: 2013-06-30 18:17
IPMI Event Code: F480009806E00320 000000000000000B

Record Type         = E0h
Reporting Entity ID = System Firmware - cpu 6
Event ID            = #152

...........................................................

Keyword             = MC_INITIATED   

Description:

A Machine Check has been initiated

Cause / Action:

A Machine Check has occurred.

Recommendation:

Analyze cause of Machine Check using diag's and EFI tools.

___________________________________________________________

Alert Level         = 7  - Fatal                  
Data Type           = 20 - Major change in state                     
                           For details, see 4.2.20 of Event Architecture Specification
Data                = 00 00 00 00 00 00 00 0B
_______________________________________________________________________________



IPMI Event Code: C151C23C67020310 003FA17000130300

Timestamp (GMT)   = Wed Jun 19 23:19:03 2013
Generator         = System Software ID = 0
Alert Level       = 7 - Fatal                  

Sensor                             
Number     Triplet      Data 2      Data 3
------    --------      ------      ------
   00      13:70:A1        3F          00

Decoding as system type: Ruby/Sapphire               

Sensor Number : 00 - Various, refer to manual        

Sensor Type   : 13 = Critical Interrupt

Event type    = Assertion event : OEM defined
    Keyword = MACHINE_CHECK_INITIATED
Machine Check initiated.  Processor = 00
Go to the EFI shell and do

    Shell> errdump mca 'filename'

where filename is any name you like.  This file can then be run
through the MCA analyser tool

A Machine Check Abort event means the hardware detected a critical error. This event is generated
whenever a system error due to processor, firmware, hardware and operating system is encountered.
MCA events may be either recoverable or non-recoverable. If it is recoverable, the system will
attempt to recover from the error for the purpose of maintaining high availability. An example of
which is automatic disabling of a failing processor. For non-recoverable errors, the system will
either stop or reboot to prevent data corruption and unreliable operation

Please send over the MCA file and the log from your upgrade so we can take a look.
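
As a cross-check on the decoder output above, here is a small Python sketch that pulls a few fields out of the first 64-bit word of each event code. The byte offsets are not taken from HP's Event Architecture Specification. They are inferred only by lining up the two samples in this thread with their decoded output, so treat them as assumptions: byte 5 looks like the record type (E0h in the first code), bytes 1 to 4 of the other record match the decoded GMT timestamp as Unix seconds, and in the E0 record byte 3 matches Event ID #152 and byte 4 matches "cpu 6". The event ID may well be wider than one byte.

```python
from datetime import datetime, timezone


def describe_event(first_word):
    """Decode a few fields from the first 64-bit word of an event code.

    Offsets are guesses derived from two samples, not from a spec.
    """
    text = first_word.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    if len(raw) != 8:
        raise ValueError("expected a 64-bit hex word")
    record_type = raw[5]
    info = {"record_type": record_type}
    if record_type == 0x02:
        # Bytes 1..4, big-endian, read as seconds since the Unix epoch.
        seconds = int.from_bytes(raw[1:5], "big")
        info["timestamp"] = datetime.fromtimestamp(seconds, tz=timezone.utc)
    elif record_type == 0xE0:
        info["event_id"] = raw[3]          # 0x98 is 152
        info["reporting_entity"] = raw[4]  # CPU number in the sample
    return info


if __name__ == "__main__":
    print(describe_event("0xF480009806E00320"))
    print(describe_event("C151C23C67020310"))
```

For the second code the sketch yields 2013-06-19 23:19:03 UTC, which agrees with the "Timestamp (GMT)" line in the decoded output. It does not replace HP's MCA analyser; it only shows that the raw codes and the decoded text are consistent.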



