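Hello everyone!
Two P550 machines share one FAStT600 disk array. Each host has two HBA cards, but only one HBA on each host is connected to the FAStT600. Installation was completed on April 20. Since around May 13, the following errors have been logged about 3-4 times per minute: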
#root:/>errpt | more
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
C86ACB7E 0529213706 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213606 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213606 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213606 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213606 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213506 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213506 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213406 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213406 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213406 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213306 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213306 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213206 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213106 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213106 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213106 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213006 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213006 I H dac0 SINGLE CONTROLLER RESTARTED
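To check the "3-4 times per minute" estimate against the log itself, the summary lines can be grouped by their timestamp column, which errpt prints as MMDDhhmmYY. The following is a minimal sketch, assuming the errpt output has been pasted or saved as text; the function name `count_per_minute` and the embedded sample (a subset of the lines above) are illustrative only.

```python
from collections import Counter

# A subset of the errpt summary lines shown above, header included.
ERRPT_LINES = """\
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
C86ACB7E 0529213706 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213606 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213606 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213606 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213606 I H hdisk3 ARRAY CONFIGURATION CHANGED
0148FAED 0529213506 I H dac0 SINGLE CONTROLLER RESTARTED
C86ACB7E 0529213506 I H hdisk3 ARRAY CONFIGURATION CHANGED
"""


def count_per_minute(text):
    """Count errpt summary entries per timestamp (MMDDhhmmYY, minute resolution)."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any line whose second field is not a 10-digit stamp.
        if len(fields) < 6 or len(fields[1]) != 10 or not fields[1].isdigit():
            continue
        counts[fields[1]] += 1
    return counts


if __name__ == "__main__":
    for stamp, n in sorted(count_per_minute(ERRPT_LINES).items()):
        mm, dd, hh, mi, yy = (stamp[i:i + 2] for i in range(0, 10, 2))
        print(f"20{yy}-{mm}-{dd} {hh}:{mi}  {n} entries")
```

Counting both identifiers together, as here, gives the overall rate; filtering on the first field would give the rate per error type.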
Detailed error information:
#root:/>errpt -aj 0148FAED | more
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR27
IDENTIFIER: 0148FAED
Date/Time: Mon May 29 21:38:58 BEIST 2006
Sequence Number: 33016
Machine Id: 000A593AD600
Node Id: sapdev
Class: H
Type: INFO
Resource Name: dac0
Resource Class: array
Resource Type: ibm-dac-V4
Location: U787B.001.DNW84AD-P1-C2-T1-W200500A0B821040C
VPD:
Manufacturer................IBM
Machine Type and Model......1722-600
Part Number.................12844-00
ROS Level and ID............0520
Description
SINGLE CONTROLLER RESTARTED
Probable Causes
A COMMUNICATION OR HARDWARE PROBLEM REPAIRED
LUN MOVED TO A CONTROLLER WITHOUT A PATH
User Causes
ONE CONTROLLER DECONFIGURED BY USER
Recommended Actions
IF THIS IS A DUAL CONTROLLER, IT IS IN A
NON-REDUNDANT CONFIGURATION, RECONFIGURE
THE DAC WHEN POSSIBLE
Failure Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS
Recommended Actions
NO ACTION NECESSARY
Detail Data
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0400 00EE 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 2D3C 9000 F705 3207 0000 0000 0000 0003 0000 0000 E400 0000 0000 0003
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR27
IDENTIFIER: 0148FAED
#root:/>errpt -aj C86ACB7E | more
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR10
IDENTIFIER: C86ACB7E
Date/Time: Mon May 29 21:39:47 BEIST 2006
Sequence Number: 33017
Machine Id: 000A593AD600
Node Id: sapdev
Class: H
Type: INFO
Resource Name: hdisk3
Resource Class: disk
Resource Type: array
Location: U787B.001.DNW84AD-P1-C2-T1-W200500A0B821040C-L1000000000000
Description
ARRAY CONFIGURATION CHANGED
Probable Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS
Failure Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS
Recommended Actions
NO ACTION NECESSARY
Detail Data
SENSE DATA
0600 1600 0000 0000 0000 0000 0000 0000 0000 0000 0000 19AA 0102 0000 7000 0500
0000 0098 0000 0000 9401 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0000
0002 1600 0016 0000 0000 0000 0000 0000 0000 3154 3630 3236 3235 3738 2020 2020
2020 0612 1600 0001 0000 0600 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 001B 82F2 3035 3239 3036 2F30 3733 3234 3900 0000 0000 0000 0000 0000
0000 0000 2E06 B000 F705 3207 0000 0000 0000 0000 0000 0000 E400 FFFF 0000 0003
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR10
IDENTIFIER: C86ACB7E
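Part of the FCP_ARRAY_ERR10 sense data above is plain ASCII, which is easier to read once decoded. The sketch below only extracts printable runs from the hex dump; it does not interpret the sense data layout, which is not described in this post. The name `printable_runs` and the 4-character minimum are arbitrary choices made for illustration.

```python
import re

# SENSE DATA from the FCP_ARRAY_ERR10 entry above, copied verbatim.
SENSE_DATA = """\
0600 1600 0000 0000 0000 0000 0000 0000 0000 0000 0000 19AA 0102 0000 7000 0500
0000 0098 0000 0000 9401 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0000
0002 1600 0016 0000 0000 0000 0000 0000 0000 3154 3630 3236 3235 3738 2020 2020
2020 0612 1600 0001 0000 0600 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 001B 82F2 3035 3239 3036 2F30 3733 3234 3900 0000 0000 0000 0000 0000
0000 0000 2E06 B000 F705 3207 0000 0000 0000 0000 0000 0000 E400 FFFF 0000 0003
0000 0000
"""


def printable_runs(hex_dump, min_len=4):
    """Return runs of at least min_len printable ASCII bytes found in a hex dump."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    runs = []
    for match in re.finditer(pattern, raw):
        text = match.group().decode("ascii").strip()
        if text:  # ignore runs that consist only of padding spaces
            runs.append(text)
    return runs


if __name__ == "__main__":
    for run in printable_runs(SENSE_DATA):
        print(run)
```

On this dump the function yields two strings, `1T60262578` and `052906/073249`. The first resembles a serial-number-like identifier and the second a date/time stamp, but both readings are guesses and would need to be confirmed against the array's documentation.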
Running iostat 1 shows iowait at about 26%, yet none of the disks shows any read or write activity:
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 709.0 0.1 0.2 73.8 26.0
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk1 0.0 0.0 0.0 0 0
hdisk0 0.0 0.0 0.0 0 0
dac0 0.0 0.0 0.0 0 0
dac0-utm 0.0 0.0 0.0 0 0
hdisk2 0.0 0.0 0.0 0 0
hdisk3 0.0 0.0 0.0 0 0
hdisk4 0.0 0.0 0.0 0 0
cd0 0.0 0.0 0.0 0 0
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 695.1 0.1 0.1 74.2 25.6
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk1 0.0 0.0 0.0 0 0
hdisk0 0.0 0.0 0.0 0 0
dac0 0.0 0.0 0.0 0 0
dac0-utm 0.0 0.0 0.0 0 0
hdisk2 0.0 0.0 0.0 0 0
hdisk3 0.0 0.0 0.0 0 0
hdisk4 0.0 0.0 0.0 0 0
cd0 0.0 0.0 0.0 0 0
Checking the dac status gives the following:
#root:/usr/ucb>fget_config -l dar0
dac0 ACTIVE dacNONE ACTIVE
hdisk2 dac0
hdisk3 dac0
hdisk4 dac0
#root:/usr/ucb>fget_config -l dar0
dac0 ACTIVE dacNONE ACTIVE
hdisk2 dacNONE
hdisk3 dacNONE
hdisk4 dacNONE
#root:/usr/ucb>fget_config -l dar0
dac0 ACTIVE dacNONE ACTIVE
hdisk2 dacNONE
hdisk3 dacNONE
hdisk4 dacNONE
dac0 ACTIVE dacNONE ACTIVE
hdisk2 dac0
hdisk3 dac0
hdisk4 dac0
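The repeated fget_config runs above show the hdisks listed under dac0 in one sample and under dacNONE in the next. A small sketch like the one below can make such changes explicit when many samples are collected. It assumes each sample is the saved text of one `fget_config -l dar0` run in the two-column form shown above; `parse_owners` and `ownership_changes` are illustrative names, not part of any AIX tool.

```python
# Four consecutive samples, transcribed from the output above.
ON_DAC0 = "dac0 ACTIVE dacNONE ACTIVE\nhdisk2 dac0\nhdisk3 dac0\nhdisk4 dac0\n"
ON_NONE = "dac0 ACTIVE dacNONE ACTIVE\nhdisk2 dacNONE\nhdisk3 dacNONE\nhdisk4 dacNONE\n"
SAMPLES = [ON_DAC0, ON_NONE, ON_NONE, ON_DAC0]


def parse_owners(output):
    """Map each hdisk to the controller name listed next to it."""
    owners = {}
    for line in output.splitlines():
        fields = line.split()
        # Only "hdiskN <controller>" lines carry ownership information.
        if len(fields) == 2 and fields[0].startswith("hdisk"):
            owners[fields[0]] = fields[1]
    return owners


def ownership_changes(samples):
    """List (sample_index, disk, old_owner, new_owner) for every change."""
    changes = []
    previous = None
    for index, sample in enumerate(samples):
        current = parse_owners(sample)
        if previous is not None:
            for disk in sorted(current):
                if disk in previous and previous[disk] != current[disk]:
                    changes.append((index, disk, previous[disk], current[disk]))
        previous = current
    return changes


if __name__ == "__main__":
    for index, disk, old, new in ownership_changes(SAMPLES):
        print(f"sample {index}: {disk} moved {old} -> {new}")
```

If dacNONE stands for a controller that this host has no configured path to (an assumption, though it would fit the "LUN MOVED TO A CONTROLLER WITHOUT A PATH" cause listed in the error report), then each reported move corresponds to the LUNs switching between the reachable and the unreachable controller.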
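I do not understand what the output above means. These machines run the company's ERP system (SAP + DB2), and the whole system is now very slow. In terms of system resources, apart from iowait reaching about 26%, CPU, memory, and swap all look fine. The database buffer hit ratio is above 94%, and SAP memory management shows no bottleneck either. Could anyone help analyze the cause and, ideally, suggest a solution?
Many thanks!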