|
Host: 7026-M80.
It has been crashing frequently of late.
The LED shows 888 102 300 0c0 and 888 102 605 0c5.
I looked these up, and they mean:
"300 Data storage interrupt from the processor.
0c5 Dump Did Not Start or Dump Crashed."
In other words, the system hit an unexpected data storage interrupt, which caused the crash.

After rebooting, errpt shows the following errors.
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME DESCRIPTION
0BA49C99   1024183004 T H scsi1         SCSI BUS ERROR
0BA49C99   1024182404 T H scsi1         SCSI BUS ERROR
1581762B   1024181104 T H hdisk0        DISK OPERATION ERROR
0BA49C99   1024181104 T H scsi0         SCSI BUS ERROR
613E5F38   1024181104 P H LVDD          I/O ERROR DETECTED BY LVM
1581762B   1024181104 T H hdisk0        DISK OPERATION ERROR
0BA49C99   1024181104 T H scsi0         SCSI BUS ERROR

scsi0 and scsi1 (Wide/Ultra-2 SCSI I/O Controller) should be the buses that connect the internal disks and the SCSI I/O devices.
Judging from experience, the problem is most likely in data transfer on the system's SCSI bus.

I ran diag on scsi0, sysplanar0, scsi1, hdisk0 and hdisk1.
Verify reported no errors. Problem Determination produced the following:
1. Ref. code: B1194690
2. 62D-129: Error log analysis indicates a SCSI bus problem.
   SCSI bus problem: cables, terminators or other SCSI devices
   hdisk0 FRU: 07N3778 9.1GB
   PCI2 FRU: 04N6907
   ···
   ··
   ·

Based on the explanation of (1), "B1xx 4690 Service processor firmware/AIX interface problem detected Call second level of support", I checked the firmware level, which was "System Firmware: M2P01113 Platform Firmware: MM010507". I then upgraded the firmware to the latest level. After rebooting, new errors of the same kind still appear in errpt.

The other machine in the HA pair also shows similar errors on scsi1.

#############################################

The main problem right now is that I am at a loss with errors like "0BA49C99 1024181104 T H scsi0 SCSI BUS ERROR". I have run into similar errors a few times before, but never this frequently. In the past, after such errors, the internal disk would occasionally even fail to be detected. I really don't know how to fix this.

  |
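To see how the errors cluster by time and by device, the errpt summary lines quoted in the post can be tallied with a short script. The following is only a rough sketch: it assumes the summary TIMESTAMP column uses the MMDDhhmmYY layout, and the names `parse_errpt_line`, `entries`, `by_resource` and `by_time` are made up for this illustration, not part of any AIX tool.

```python
from collections import Counter
from datetime import datetime

# errpt summary lines copied from the post (header row omitted).
ERRPT_SUMMARY = """\
0BA49C99 1024183004 T H scsi1 SCSI BUS ERROR
0BA49C99 1024182404 T H scsi1 SCSI BUS ERROR
1581762B 1024181104 T H hdisk0 DISK OPERATION ERROR
0BA49C99 1024181104 T H scsi0 SCSI BUS ERROR
613E5F38 1024181104 P H LVDD I/O ERROR DETECTED BY LVM
1581762B 1024181104 T H hdisk0 DISK OPERATION ERROR
0BA49C99 1024181104 T H scsi0 SCSI BUS ERROR
"""


def parse_errpt_line(line):
    """Split one summary line into its six columns (hypothetical helper)."""
    # The description may contain spaces, so split at most five times.
    ident, stamp, err_type, err_class, resource, description = line.split(None, 5)
    return {
        "identifier": ident,
        # Assumption: the timestamp is MMDDhhmmYY (month, day, hour, minute, year).
        "when": datetime.strptime(stamp, "%m%d%H%M%y"),
        "type": err_type,
        "class": err_class,
        "resource": resource,
        "description": description.strip(),
    }


entries = [parse_errpt_line(l) for l in ERRPT_SUMMARY.splitlines() if l.strip()]

# How many entries each resource logged, and how many share each timestamp.
by_resource = Counter(e["resource"] for e in entries)
by_time = Counter(e["when"] for e in entries)

for when, count in sorted(by_time.items()):
    print(when.strftime("%Y-%m-%d %H:%M"), count)
for resource, count in by_resource.most_common():
    print(resource, count)
```

On this listing, five of the seven entries carry the same timestamp (1024181104) and cover scsi0, hdisk0 and LVDD together, while the two later entries are on scsi1 only. That grouping is just a reading aid; it does not by itself say which component is at fault.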
|