Hello everyone,

Two P550 machines are both connected to one TAST T600 disk array. Each host has two HBA cards, but only one HBA on each host is connected to the TAST T600. The machines were installed on April 20. Since about May 13, errors like the following have been logged roughly 3-4 times per minute:

#root:/>errpt | more
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
C86ACB7E   0529213706 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213606 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213606 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213606 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213606 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213506 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213506 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213406 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213406 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213406 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213306 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213306 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213206 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213106 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213106 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213106 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213006 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213006 I H dac0           SINGLE CONTROLLER RESTARTED

Detailed error information:

#root:/>errpt -aj 0148FAED | more
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR27
IDENTIFIER: 0148FAED

Date/Time: Mon May 29 21:38:58 BEIST 2006
Sequence Number: 33016
Machine Id: 000A593AD600
Node Id: sapdev
Class: H
Type: INFO
Resource Name: dac0
Resource Class: array
Resource Type: ibm-dac-V4
Location: U787B.001.DNW84AD-P1-C2-T1-W200500A0B821040C
VPD:
        Manufacturer................IBM
        Machine Type and Model......1722-600
        Part Number.................12844-00
        ROS Level and ID............0520

Description
SINGLE CONTROLLER RESTARTED

Probable Causes
A COMMUNICATION OR HARDWARE PROBLEM REPAIRED
LUN MOVED TO A CONTROLLER WITHOUT A PATH

User Causes
ONE CONTROLLER DECONFIGURED BY USER

        Recommended Actions
        IF THIS IS A DUAL CONTROLLER, IT IS IN A
        NON-REDUNDANT CONFIGURATION, RECONFIGURE
        THE DAC WHEN POSSIBLE

Failure Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

        Recommended Actions
        NO ACTION NECESSARY

Detail Data
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0400 00EE 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 2D3C 9000 F705 3207 0000 0000 0000 0003 0000 0000 E400 0000 0000 0003
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR27
IDENTIFIER: 0148FAED

#root:/>errpt -aj C86ACB7E | more
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR10
IDENTIFIER: C86ACB7E

Date/Time: Mon May 29 21:39:47 BEIST 2006
Sequence Number: 33017
Machine Id: 000A593AD600
Node Id: sapdev
Class: H
Type: INFO
Resource Name: hdisk3
Resource Class: disk
Resource Type: array
Location: U787B.001.DNW84AD-P1-C2-T1-W200500A0B821040C-L1000000000000

Description
ARRAY CONFIGURATION CHANGED

Probable Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

Failure Causes
ARRAY CONTROLLER
CABLES AND CONNECTIONS

        Recommended Actions
        NO ACTION NECESSARY

Detail Data
SENSE DATA
0600 1600 0000 0000 0000 0000 0000 0000 0000 0000 0000 19AA 0102 0000 7000 0500
0000 0098 0000 0000 9401 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0000
0002 1600 0016 0000 0000 0000 0000 0000 0000 3154 3630 3236 3235 3738 2020 2020
2020 0612 1600 0001 0000 0600 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0005 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 001B 82F2 3035 3239 3036 2F30 3733 3234 3900 0000 0000 0000 0000 0000
0000 0000 2E06 B000 F705 3207 0000 0000 0000 0000 0000 0000 E400 FFFF 0000 0003
0000 0000
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR10
IDENTIFIER: C86ACB7E

Running iostat 1 shows iowait at about 26%, yet none of the disks shows any read or write activity:

tty:      tin         tout   avg-cpu:  % user    % sys     % idle    % iowait
          0.0        709.0               0.1      0.2       73.8      26.0

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk1           0.0       0.0       0.0          0         0
hdisk0           0.0       0.0       0.0          0         0
dac0             0.0       0.0       0.0          0         0
dac0-utm         0.0       0.0       0.0          0         0
hdisk2           0.0       0.0       0.0          0         0
hdisk3           0.0       0.0       0.0          0         0
hdisk4           0.0       0.0       0.0          0         0
cd0              0.0       0.0       0.0          0         0

tty:      tin         tout   avg-cpu:  % user    % sys     % idle    % iowait
          0.0        695.1               0.1      0.1       74.2      25.6

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk1           0.0       0.0       0.0          0         0
hdisk0           0.0       0.0       0.0          0         0
dac0             0.0       0.0       0.0          0         0
dac0-utm         0.0       0.0       0.0          0         0
hdisk2           0.0       0.0       0.0          0         0
hdisk3           0.0       0.0       0.0          0         0
hdisk4           0.0       0.0       0.0          0         0
cd0              0.0       0.0       0.0          0         0

Checking the dac status repeatedly gives the following:

#root:/usr/ucb>fget_config -l dar0
dac0 ACTIVE dacNONE ACTIVE
hdisk2  dac0
hdisk3  dac0
hdisk4  dac0
#root:/usr/ucb>fget_config -l dar0
dac0 ACTIVE dacNONE ACTIVE
hdisk2  dacNONE
hdisk3  dacNONE
hdisk4  dacNONE
#root:/usr/ucb>fget_config -l dar0
dac0 ACTIVE dacNONE ACTIVE
hdisk2  dacNONE
hdisk3  dacNONE
hdisk4  dacNONE
dac0 ACTIVE dacNONE ACTIVE
hdisk2  dac0
hdisk3  dac0
hdisk4  dac0

I don't know what the information above means. The machines run the company's ERP system (SAP + DB2), and the whole system is now very slow. In terms of system resources, apart from iowait reaching about 26%, the CPU, memory, and swap all look fine. The database buffer hit ratio is above 94%, and SAP memory management shows no bottleneck either. Could anyone help analyze the cause and, ideally, suggest a solution?

Many thanks!
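To put a number on the "3-4 times per minute" figure, the errpt summary can be tallied per minute. The following is only a minimal sketch in Python, assuming the summary has been saved as text. The names `parse_errpt_summary`, `errors_per_minute`, and `SAMPLE` are made up for illustration, and the sample is just the first seven entries from the listing in this post. It relies on the errpt summary timestamp being in MMDDhhmmYY form.

```python
from collections import Counter
from datetime import datetime

# First seven entries of the errpt summary quoted in this post (illustrative sample).
SAMPLE = """\
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
C86ACB7E   0529213706 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213606 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213606 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213606 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213606 I H hdisk3         ARRAY CONFIGURATION CHANGED
0148FAED   0529213506 I H dac0           SINGLE CONTROLLER RESTARTED
C86ACB7E   0529213506 I H hdisk3         ARRAY CONFIGURATION CHANGED
"""


def parse_errpt_summary(text):
    """Yield (identifier, minute, resource_name) for each summary data line."""
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header, and anything too short to be a data line.
        if len(fields) < 6 or fields[0] == "IDENTIFIER":
            continue
        identifier, stamp, resource = fields[0], fields[1], fields[4]
        # Summary timestamps are MMDDhhmmYY, e.g. 0529213706 = May 29 21:37, 2006.
        minute = datetime.strptime(stamp, "%m%d%H%M%y")
        yield identifier, minute, resource


def errors_per_minute(text):
    """Count how many entries were logged in each minute."""
    counts = Counter()
    for _identifier, minute, _resource in parse_errpt_summary(text):
        counts[minute] += 1
    return counts


counts = errors_per_minute(SAMPLE)
for minute in sorted(counts):
    print(minute.strftime("%Y-%m-%d %H:%M"), counts[minute])
```

Feeding it the full errpt output instead of `SAMPLE` would show whether the rate is steady or comes in bursts, which may help when correlating the errors with the alternating fget_config results.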