Chinaunix

Title: M80 crashed, please help find the cause!

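Author: 腻水的飞鱼    Time: 2007-11-14 19:42
Title: M80 crashed, please help find the cause!
One of our M80 hosts crashed suddenly, and the operator panel showed 888 102 207 0C0. As far as I can tell, that panel code means the dump completed. After rebooting I checked errpt and found the following entries: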
LABEL:                MACHINE_CHECK_CHRP
IDENTIFIER:        56CDC3C8

Date/Time:       Mon Nov 12 12:25:32
Sequence Number: 106273
Machine Id:      000CF87F4C00
Node Id:         localhost
Class:           H
Type:            PERM
Resource Name:   sysplanar0
Resource Class:  planar
Resource Type:   sysplanar_rspc
Location:        00-00

Description
MACHINE CHECK

Probable Causes
UNDETERMINED

Failure Causes
PROCESSOR MACHINE CHECK

        Recommended Actions
        RUN SYSTEM DIAGNOSTICS.

Detail Data
MACHINE STATUS SAVE/RESTORE REGISTER 0
0000 0000 0002 5604
MACHINE STATUS SAVE/RESTORE REGISTER 1

--------------------------------------------------------------------------------
LABEL:                SCAN_ERROR_CHRP
IDENTIFIER:        BFE4C025

Date/Time:       Tue Nov 13 09:00:04
Sequence Number: 106274
Machine Id:      000CF87F4C00
Node Id:         localhost
Class:           H
Type:            PERM
Resource Name:   sysplanar0
Resource Class:  planar
Resource Type:   sysplanar_rspc
Location:        00-00

Description
UNDETERMINED ERROR

Failure Causes
UNDETERMINED

        Recommended Actions
        RUN SYSTEM DIAGNOSTICS.
--------------------------------------------------------------------------------
root's mail contained this report:
A PROBLEM WAS DETECTED ON Tue Nov 13 09:02:07 TAIST 2007                  801014

The Service Request Number(s)/Probable Cause(s)
(causes are listed in descending order of probability):

  651-880: The CEC or SPCN reported an error. Report the SRN and the
           following reference and physical location codes to your service
           provider.
           Error log information:
                 Sequence number: 106274
    Ref. Code: B1194690 FRU: n/a              n/a            


I ran diag and it found no errors. Could everyone please help analyze what caused the crash? Thanks.
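For anyone who wants to cross-check the diag mail against the error log, here is a minimal Python sketch. It assumes the `errpt -a` output has been saved as text and is processed on a machine that has Python 3. The names `parse_errpt` and `SAMPLE` are made up for illustration, and only the header fields shown in this post are parsed. The idea is to split the report on the dashed separator lines and index the entries by sequence number, so the 106274 quoted in the mail can be matched to its entry and the remaining entries singled out.

```python
import re

# Trimmed copy of the two entries from this post (headers only).
SAMPLE = """\
LABEL:                MACHINE_CHECK_CHRP
IDENTIFIER:        56CDC3C8

Date/Time:       Mon Nov 12 12:25:32
Sequence Number: 106273
--------------------------------------------------------------------------------
LABEL:                SCAN_ERROR_CHRP
IDENTIFIER:        BFE4C025

Date/Time:       Tue Nov 13 09:00:04
Sequence Number: 106274
"""

FIELD_RE = re.compile(r"^\s*(LABEL|IDENTIFIER|Date/Time|Sequence Number):\s*(.*\S)")


def parse_errpt(text):
    """Split errpt -a style text into entries and keep a few header fields."""
    entries = []
    # Entries are separated by lines consisting only of dashes.
    for chunk in re.split(r"(?m)^-{20,}[ \t]*$", text):
        fields = {}
        for line in chunk.splitlines():
            match = FIELD_RE.match(line)
            if match:
                fields[match.group(1)] = match.group(2)
        if "LABEL" in fields:
            entries.append(fields)
    return entries


if __name__ == "__main__":
    entries = parse_errpt(SAMPLE)
    mail_seq = "106274"  # sequence number quoted in the diag mail
    for entry in entries:
        role = "named in mail" if entry.get("Sequence Number") == mail_seq else "other entry"
        print(entry.get("Sequence Number"), entry["LABEL"], entry.get("IDENTIFIER"), role)
```

With the two entries above, the mail's sequence number lands on SCAN_ERROR_CHRP, which leaves MACHINE_CHECK_CHRP (106273) as the other entry to look at.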
Author: 金牌小卧底    Time: 2007-11-14 20:36
Description of the B1194690 error:

1. This error code indicates that the operating system terminated early (which usually implies an operating system crash). This error code may appear in the service processor error log by itself. However, in the AIX error log, there should be another error which points to the cause of the operating system crash. Use the other error as the starting point for your service action.
2. The other possibility is that the operating system was not found during a prior boot attempt. To determine if this occurred, do the following: Look at the AIX error log entry containing B1xx4690. This will be a "SCAN_ERROR_CHRP" error with an identifier of BFEC0425. In the detail data, find the string "B1xx4690" (If present, it will be at byte 60 of the detail data.) Then go forward 8 bytes after the "B1" to byte 68 and look at bytes 68 and 69. If the values of bytes 68 and 69 are A2B0, this indicates that the firmware was unable to find a bootable device in the boot list that is set in the SMS menus. If the system is up, the boot list problem has been corrected and the B1xx 4690 can be treated as an informational message with no actions required.
3. Call service support.
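
The byte check in item 2 can be sketched roughly as below. This is only an illustration under several assumptions that the text above does not settle: the detail data is available as a hex dump, byte offsets are counted from zero, and the reference code at byte 60 is stored either as raw bytes (B1 xx 46 90) or as the ASCII characters "B1xx4690". The function names and the test data are made up, since the SCAN_ERROR_CHRP detail data was not posted in this thread.

```python
def parse_hex_dump(text: str) -> bytes:
    """Turn errpt-style hex groups such as '0000 0000 0002 5604' into bytes."""
    return bytes.fromhex("".join(text.split()))


def has_refcode_at_60(detail: bytes) -> bool:
    """Check for B1xx4690 at offset 60 (assumed zero-based), raw or ASCII."""
    raw = len(detail) >= 64 and detail[60] == 0xB1 and detail[62:64] == b"\x46\x90"
    text = detail[60:62] == b"B1" and detail[64:68] == b"4690"
    return raw or text


def classify(detail: bytes) -> str:
    """Apply the rule from item 2 to the detail data of one error log entry."""
    if not has_refcode_at_60(detail):
        return "no B1xx4690 at byte 60"
    if detail[68:70] == b"\xA2\xB0":
        # Firmware could not find a bootable device in the SMS boot list.
        return "no bootable device found in boot list"
    # Otherwise fall back to item 1: the operating system terminated early.
    return "operating system terminated early"


if __name__ == "__main__":
    # Hypothetical detail data, built only to exercise both branches.
    detail = bytearray(80)
    detail[60:64] = bytes.fromhex("B1194690")
    print(classify(bytes(detail)))
    detail[68:70] = bytes.fromhex("A2B0")
    print(classify(bytes(detail)))
```

If bytes 68 and 69 are not A2B0, as would be expected for a machine that was up and then crashed, the B1194690 entry points back to item 1 and the other error in the AIX log.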

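Based on the description above, I think the main thing to investigate now is still the 56CDC3C8 error identifier.
Let me dig a little more. Everyone, please think it over as well.
OP, could you tell us the OS level of this machine and which applications it runs?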

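[ Last edited by 金牌小卧底 on 2007-11-14 20:46 ]
Author: 腻水的飞鱼    Time: 2007-11-14 20:49
The OS is 4330-09 and the machine is an Oracle database server. Thanks to the poster above.
Author: 金牌小卧底    Time: 2007-11-14 20:53
Found it.
OP, take a look at this link: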
http://www-1.ibm.com/support/docview.wss?uid=isg1IY32808

It is probably a system bug: a timing delay problem that leads to an abnormal crash.

However, this APAR is for AIX 5.1.

I am a bit confused myself.

[ Last edited by 金牌小卧底 on 2007-11-14 20:55 ]
Author: yddll    Time: 2007-11-14 20:55
I agree with the second post; B1194690 is probably just a consequence.
Author: 金牌小卧底    Time: 2007-11-14 20:58
It now looks like we need to focus on how to resolve this issue:
Correct timing delay in driver to enforce I/O serialization
Let me think about it some more....

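The one suggestion I can think of right now: if possible, upgrade from 4.3.3.09 to 11, the highest level for 4.3. My guess is that this is still a timing delay bug.
At this point I am not sure whether a fix for 4.3 actually resolves the problem.
What we saw above is the fix for 5.1. It would be best to confirm with IBM that the problem is fixed in ML11; otherwise it will happen again.
I hope other experts will join the discussion.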

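[ Last edited by 金牌小卧底 on 2007-11-14 21:10 ]
Author: yddll    Time: 2007-11-14 21:23
It's an old machine. Patch it up and do what you can.
Author: 腻水的飞鱼    Time: 2007-11-14 21:29
Thanks, 金牌小卧底 and yddll. I will write a report recommending a microcode and OS upgrade, and do what I can.
Author: 金牌小卧底    Time: 2007-11-14 21:34
"Do what you can".. haha, classic.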



