How can I tell whether a panic was caused by hardware or software?

Posted 2007-10-06 23:16
HP UNIX: How can I tell whether a panic was caused by hardware or software?
2006-6-8 14:14:51 equalnull Source: HP



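How can I tell whether a panic was caused by hardware or software?

Problem Description

When a reboot occurs because of a panic, the cause may be hardware related or
software/OS related. If it is hardware related, identifying the cause as hardware
(e.g. HPMC) and getting the appropriate hardware resources involved avoids
unnecessary time spent analyzing a crash dump.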

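Configuration Info


Solution

To determine whether a panic was caused by hardware or software, the first step
is to check the panic message in the shutdownlog or the dump INDEX file: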

tail /etc/shutdownlog

or, from the dump core.X (10.X) or crash.X (11.X) directory:

more INDEX

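If the entry in the shutdownlog corresponding to the panic is:
Reboot after panic: , isr.ior = X'X.Y'Y    (see NOTE 1 below)
or
Reboot after panic: trap type 1 (HPMC)

or if the panic line from the INDEX file is:
panic , isr.ior = X'X.Y'Y    (see NOTE 1 below)
or
panic trap type 1 (HPMC)

then an HPMC (High Priority Machine Check) has probably occurred and a
hardware service call should be opened.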

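NOTE 1: This message also appears if the system is running MC Serviceguard or if the system rebooted because of an operator-induced TOC (Transfer of Control); in those cases a crash dump analysis may still be required.

NOTE 2: The vast majority of HPMCs have hardware causes, but there are a few exceptions; once the chassis codes have been analyzed, hardware support may request a crash dump analysis.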

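Certain S800 servers support online collection of HPMC PIM (Processor Internal
Memory) information. Combined with installed diagnostics and a current version
of a utility called pdcinfo, this writes hardware fault information to a file
called /var/tombstones/ts99. This information is analyzed by hardware service.
It may be necessary to run online diagnostics, or to reboot the system and
obtain the hardware fault (chassis) codes from a boot or service menu. On
V-class and N-class servers, HPMC and hardware fault information is obtained
with different utilities. Hardware service can assist with these operations.
......... The original English text follows ....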

UXDNKBRC00001764
How can I tell if a panic was caused by hardware or software?
Problem Description

When a reboot due to a panic occurs, the cause could be hardware related or
software/OS related. If it is hardware related, unnecessary time spent
analyzing a crash dump can be avoided by identifying the cause as hardware (e.g.
HPMC) and getting the appropriate hardware resources involved.



Configuration Info

Solution

The first step in determining if a panic is caused by hardware vs. software is
to check the panic message in the shutdownlog or the dump INDEX file:

tail /etc/shutdownlog

or from the dump core.X (10.X) or crash.X (11.X) directory:

more INDEX

If the entry in the shutdownlog corresponding to the panic is:
Reboot after panic: , isr.ior = X'X.Y'Y    (see NOTE 1 below)
or
Reboot after panic: trap type 1 (HPMC)

or if the panic line from the INDEX file is:
panic , isr.ior = X'X.Y'Y    (see NOTE 1 below)
or
panic trap type 1 (HPMC)

then it is likely that an HPMC (High Priority Machine Check) has occurred and a
hardware service call should be opened.

NOTE 1: If the system is running MC Serviceguard or if the system
rebooted due to an operator-induced TOC (Transfer of Control),
this message will also appear, and a crash dump analysis may
still be required.

NOTE 2: The vast majority of HPMCs are related to hardware causes. There
are a few exceptions, and once the chassis codes are analyzed,
hardware support may request that a crash dump analysis be performed.
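
The check described above can be sketched in a few lines of Python, for use on a copy of the shutdownlog or INDEX file taken off the affected system. This is only an illustration under stated assumptions: the function names (classify_panic_line, scan_log) and the result labels are made up for this example and are not HP terminology, and the patterns cover only the message shapes quoted in this article. It also assumes a shutdownlog entry may carry other text (such as a timestamp) before "Reboot after panic:".

```python
import re

# Patterns based only on the message shapes quoted in this article.
HPMC_RE = re.compile(r"trap type 1 \(HPMC\)")
ISR_IOR_RE = re.compile(r"isr\.ior\s*=")
# shutdownlog entries contain "Reboot after panic:"; INDEX lines start with "panic".
PANIC_RE = re.compile(r"Reboot after panic:|^panic\b")


def classify_panic_line(line):
    """Return a coarse hint for one shutdownlog or INDEX line.

    The labels are hypothetical names chosen for this sketch.
    """
    text = line.strip()
    if not PANIC_RE.search(text):
        return "not-a-panic"
    if HPMC_RE.search(text):
        # Explicit HPMC: open a hardware service call.
        return "likely-hardware"
    if ISR_IOR_RE.search(text):
        # NOTE 1: MC Serviceguard or an operator-induced TOC produces the
        # same message, so a crash dump analysis may still be required.
        return "possible-hardware"
    # Anything else is left for crash dump analysis.
    return "needs-dump-analysis"


def scan_log(lines):
    """Return (label, line) pairs for every panic line in an iterable of lines."""
    results = []
    for line in lines:
        label = classify_panic_line(line)
        if label != "not-a-panic":
            results.append((label, line.rstrip("\n")))
    return results
```

To use it, open the copied file and pass the file object to scan_log, then review any line labelled "likely-hardware" or "possible-hardware" against NOTE 1 and NOTE 2 before opening a hardware call.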


Certain S800 servers support online collection of HPMC PIM (Processor Internal
Memory) information. Combined with installed diagnostics and a current
version of a utility called pdcinfo, this writes hardware fault
information to a file called /var/tombstones/ts99. This information
can be analyzed by hardware service. It may be necessary to run online
diagnostics, or to reboot the system and obtain the hardware fault (chassis)
codes from a boot or service menu. On V-class and N-class servers, HPMC and
hardware fault information is obtained with different utilities. Hardware
service can assist with these operations.

(the end)