[Share] I finally found a fix for the BAD TRAP error

Posted on 2003-02-20 17:15
This may be the cause:
Solaris x86 (Intel Platform) Systems with Adaptec SCSI Controllers Might Hang Under Heavy Network Load

Sun(sm) Alert Notification
Sun Alert ID: 26354
Synopsis: Solaris x86 (Intel Platform) Systems with Adaptec SCSI Controllers Might Hang Under Heavy Network Load
Category: Data Loss, Availability
Product: Solaris
BugIDs: 4363919, 4405440
Avoidance: Patch, Upgrade
State: Resolved
Date Released: 11-Apr-2001, 17-May-2001
Date Closed: 17-May-2001
Date Modified: 11-Apr-2001, 24-Apr-2001, 11-May-2001, 17-May-2001
1. Impact
On Solaris x86 (Intel platform) systems equipped with an Adaptec SCSI controller, the "adp" SCSI driver might fail during heavy SCSI and network traffic, leading to a hung system and eventually causing a system panic.

Once the SCSI driver has failed, the system cannot be returned to normal operation without risking the loss of data not yet flushed to disk, which can leave the file system corrupted after reboot.

2. Contributing Factors
This problem can occur in the following releases:

Intel

Solaris 2.5.1
Solaris 2.6 without patch 111031-01
Solaris 7 without patch 108055-03
Solaris 8 without patch 111334-01
The described issue is seen only on systems that have high network activity and high SCSI activity at the same time.

With Solaris 2.6 x86, the problem is more likely to appear on multiprocessor systems.

Systems not using the latest network adapter drivers available are more likely to encounter the issue.

3. Symptoms
The system no longer responds and appears to be hung.

Should the described problem occur, SCSI timeout messages similar to the following might be seen on the console and in the "/var/adm/messages" file:

    Apr 06 15:14:04 sysa unix: WARNING: /pci@0,0/pci1011,1@b/pci9004,7178@6 (adp0):
    Feb 06 15:14:04 sysa unix:  timeout: scsi_abort request, target=2 lun=0
                                    
Eventually, the system might panic with a message similar to the following:

    BAD TRAP
    sched: Page Fault
    Kernel fault at addr=0x0, pte=0
    [...]                                    
with the value printed after "addr=" being either "0x0" or "0x4".

If the system is configured to write a crash dump during the panic, a "panic sync timeout" may additionally be encountered during the panic. On Solaris 2.6 x86 and Solaris 2.5.1 x86 based systems, this prevents the crash dump from being written to disk.
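
The symptoms above can be checked for mechanically. The following is a minimal Python sketch, not part of the Sun Alert, that scans lines taken from the console or from "/var/adm/messages" for the two patterns quoted in this section. The function name scan_messages and the regular expressions are assumptions derived only from the sample messages shown here; real messages may be worded differently.

```python
import re

# Patterns derived from the sample messages quoted in the alert (assumptions).
ADP_WARNING_RE = re.compile(r"\(adp\d+\):")
ABORT_TIMEOUT_RE = re.compile(r"timeout: scsi_abort request")
KERNEL_FAULT_RE = re.compile(r"Kernel fault at addr=(0x[0-9a-fA-F]+)")


def scan_messages(lines):
    """Summarize lines that resemble the symptoms described in the alert."""
    result = {"adp_warnings": 0, "abort_timeouts": 0, "fault_addrs": []}
    for line in lines:
        if ADP_WARNING_RE.search(line):
            result["adp_warnings"] += 1
        if ABORT_TIMEOUT_RE.search(line):
            result["abort_timeouts"] += 1
        match = KERNEL_FAULT_RE.search(line)
        # The alert says the faulting address is either 0x0 or 0x4.
        if match and int(match.group(1), 16) in (0x0, 0x4):
            result["fault_addrs"].append(match.group(1))
    return result
```

A non-zero count together with a fault address of 0x0 or 0x4 is only a hint that this alert applies; it does not confirm the cause.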


4. Relief/Workaround
Possible workarounds:

Update the network adapter drivers to their latest revision (available at http://www.sun.com/io_technologies/solaris-drivers.html).
Use a SCSI adapter that does not use the "adp" driver (see the "adp" man page for a list of devices using the "adp" driver).
Upgrade to Solaris 7 x86 or newer. These releases handle I/O interrupts more efficiently, which makes the described issue less likely to occur.
5. Resolution
This problem is addressed in the following releases:

Intel

Solaris 2.6 with patch 111031-01 or later
Solaris 7 with patch 108055-03 or later
Solaris 8 with patch 111334-01 or later
Systems running Solaris 2.5.1 x86 (Intel platform) should be upgraded to Solaris 2.6 or later with the appropriate patches to avoid the described issue.
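
As a rough way to apply the list above, the following Python sketch checks whether a patch listing contains the required patch at the required revision or later. It is an illustration under stated assumptions, not a Sun tool: it assumes the release is given as a `uname -r` style string ("5.6", "5.7", "5.8"), that the listing has lines beginning with "Patch: NNNNNN-RR" as printed by `showrev -p`, and the names REQUIRED_PATCH, installed_revisions and has_adp_fix are hypothetical. It does not detect a fix delivered through a different patch that obsoletes the ones listed.

```python
import re

# Minimum patch revision per release, taken from the Resolution list.
# Keys are `uname -r` style release strings (an assumption of this sketch).
REQUIRED_PATCH = {
    "5.6": ("111031", 1),  # Solaris 2.6
    "5.7": ("108055", 3),  # Solaris 7
    "5.8": ("111334", 1),  # Solaris 8
}

# Assumed line shape of the patch listing: "Patch: 108055-03 Obsoletes: ..."
PATCH_RE = re.compile(r"^Patch:\s*(\d{6})-(\d{2})", re.MULTILINE)


def installed_revisions(patch_listing):
    """Return {patch_id: highest revision found} for the given listing text."""
    found = {}
    for patch_id, rev in PATCH_RE.findall(patch_listing):
        found[patch_id] = max(found.get(patch_id, 0), int(rev))
    return found


def has_adp_fix(release, patch_listing):
    """True if the fix is present, False if not, None if no patch is listed
    for this release (for example Solaris 2.5.1, which must be upgraded)."""
    if release not in REQUIRED_PATCH:
        return None
    patch_id, min_rev = REQUIRED_PATCH[release]
    return installed_revisions(patch_listing).get(patch_id, 0) >= min_rev
```

For example, has_adp_fix("5.7", listing) is True only when the listing contains 108055 at revision 03 or higher.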

Change History
24-Apr-2001: added BugID 4363919
11-May-2001: patch 111334-01 for Solaris 8 is available
17-May-2001: patch 108055-03 for Solaris 7 is available
State: Resolved
The issue described in this Sun(sm) Alert document may or may not be experienced by your particular system(s). The information in this Sun(sm) Alert document may be based upon information received from third-parties. It is being provided to you "AS IS", for informational purposes only. Sun does not make any representations, warranties, or guaranties as to the quality, suitability, truth, accuracy or completeness of any of the information. Sun shall not be liable for any losses or damages suffered as a result of Customer's use or non-use of the information.


http://sunsolve.sun.com/pub-cgi/retrieve.pl?doc=fsalert%2F26354&zone_32=disk%20timeout