Chinaunix
Title:
P630 went down; please help find the cause.
Author:
tony8201
Time:
2005-07-13 17:41
At 11 p.m. on the 11th, machine A went down and the application automatically failed over to machine B. The administrator did not know about it and did not start machine A right away. At 7 p.m. on the 12th, machine B went down and the application stopped. After noticing, the administrator started machine A and machine B, and everything has been normal since. I don't understand why they went down.

Error information from machine A:
#errpt
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
2F3E09A4   0713144505 I H ent2           REPAIR ACTION
B6048838   0713112205 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
B6048838   0713111905 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
BA431EB7   0712201705 P S SRC            SOFTWARE PROGRAM ERROR
B6048838   0712201705 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
BA431EB7   0712201505 P S SRC            SOFTWARE PROGRAM ERROR
B6048838   0712201505 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
AFA89905   0712201205 I O grpsvcs        Group Services daemon started
97419D60   0712201205 I O topsvcs        Topology Services daemon started
A6DF45AA   0712200705 I O RMCdaemon      The daemon is started.
BFE4C025   0712200505 P H sysplanar0     UNDETERMINED ERROR
2BFA76F6   0711234905 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   0712200705 T O errdemon       ERROR LOGGING TURNED ON
FE2DEE00   0711234905 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   0711234905 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
AA8AB241   0711234905 T O OPERATOR       OPERATOR NOTIFICATION
BC3BE5A3   0711234905 P S SRC            SOFTWARE PROGRAM ERROR
B6048838   0624162405 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
B6048838   0624161705 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
B6048838   0624160905 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
D5385D18   0505224805 T H hdisk2         ARRAY OPERATION ERROR
3C81E43F   0427114005 P U topsvcs        Late in sending heartbeat
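The TIMESTAMP column of the errpt summary is packed as MMDDhhmmYY, which makes the order of events hard to read at a glance. A minimal sketch of decoding it and sorting a few of the machine A rows oldest first; the function names and the ROWS sample are illustrative, and the century is assumed to be 2000.

```python
from datetime import datetime

def parse_errpt_timestamp(stamp: str) -> datetime:
    """Decode an errpt summary TIMESTAMP field (MMDDhhmmYY)."""
    if len(stamp) != 10 or not stamp.isdigit():
        raise ValueError(f"unexpected errpt timestamp: {stamp!r}")
    month, day, hour, minute, year = (int(stamp[i:i + 2]) for i in range(0, 10, 2))
    # Assumption: two-digit years belong to the 2000s, as in this thread.
    return datetime(2000 + year, month, day, hour, minute)

# A few rows copied from the machine A summary: (identifier, timestamp, description).
ROWS = [
    ("BFE4C025", "0712200505", "UNDETERMINED ERROR"),
    ("2BFA76F6", "0711234905", "SYSTEM SHUTDOWN BY USER"),
    ("9DBCFDEE", "0712200705", "ERROR LOGGING TURNED ON"),
    ("FE2DEE00", "0711234905", "DUPLICATE IP ADDRESS DETECTED IN THE NET"),
]

def timeline(rows):
    """Return (datetime, identifier, description) tuples sorted oldest first."""
    return sorted((parse_errpt_timestamp(ts), ident, desc) for ident, ts, desc in rows)

if __name__ == "__main__":
    for when, ident, desc in timeline(ROWS):
        print(when.strftime("%Y-%m-%d %H:%M"), ident, desc)
```

Sorted this way, the shutdown and duplicate-IP entries of 11 July 23:49 come first, and the entries logged after the restart on 12 July follow.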
Author:
tony8201
Time:
2005-07-13 17:42
#errpt -a
LABEL: REBOOT_ID
IDENTIFIER: 2BFA76F6

Date/Time: Mon Jul 11 23:49:50 BEIS
Sequence Number: 180
Machine Id: 00570B1E4C00
Node Id: localhost
Class: S
Type: TEMP
Resource Name: SYSPROC

Description
SYSTEM SHUTDOWN BY USER

Probable Causes
SYSTEM SHUTDOWN

Detail Data
USER ID
 0
0=SOFT IPL 1=HALT 2=TIME REBOOT
 1
TIME TO REBOOT (FOR TIMED REBOOT ONLY)
 0
---------------------------------------------------------------------------
LABEL: ERRLOG_ON
IDENTIFIER: 9DBCFDEE

Date/Time: Tue Jul 12 20:07:17 BEIS
Sequence Number: 179
Machine Id: 00570B1E4C00
Node Id: localhost
Class: O
Type: TEMP
Resource Name: errdemon

Description
ERROR LOGGING TURNED ON

Probable Causes
ERRDEMON STARTED AUTOMATICALLY

User Causes
/USR/LIB/ERRDEMON COMMAND

 Recommended Actions
 NONE

---------------------------------------------------------------------------
LABEL: AIXIF_ARP_DUP_ADDR
IDENTIFIER: FE2DEE00

Date/Time: Mon Jul 11 23:49:45 BEIS
Sequence Number: 178
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: S
Type: PERM
Resource Name: SYSXAIXIF

Description
DUPLICATE IP ADDRESS DETECTED IN THE NET

Failure Causes
ARP RESPONSE RECEIVED FOR MY IP ADDRESS

 Recommended Actions
 CONTACT NETWORK ADMINISTRATOR

Detail Data
DUPLICATE IP ADDRESS
0A67 0103
MAC ADDRESS
000D 600B 8DE2
---------------------------------------------------------------------------
LABEL: AIXIF_ARP_DUP_ADDR
IDENTIFIER: FE2DEE00

Date/Time: Mon Jul 11 23:49:44 BEIS
Sequence Number: 177
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: S
Type: PERM
Resource Name: SYSXAIXIF

Description
DUPLICATE IP ADDRESS DETECTED IN THE NET

Failure Causes
ARP RESPONSE RECEIVED FOR MY IP ADDRESS

 Recommended Actions
 CONTACT NETWORK ADMINISTRATOR

Detail Data
DUPLICATE IP ADDRESS
0A67 0103
MAC ADDRESS
000D 600B 8DE2
---------------------------------------------------------------------------
LABEL: OPMSG
IDENTIFIER: AA8AB241

Date/Time: Mon Jul 11 23:49:43 BEIS
Sequence Number: 176
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: O
Type: TEMP
Resource Name: OPERATOR

Description
OPERATOR NOTIFICATION

User Causes
ERRLOGGER COMMAND

 Recommended Actions
 REVIEW DETAILED DATA

Detail Data
MESSAGE FROM ERRLOGGER COMMAND
clexit.rc : Unexpected termination of clstrmgrES
---------------------------------------------------------------------------
LABEL: SRC_SVKO
IDENTIFIER: BC3BE5A3

Date/Time: Mon Jul 11 23:49:42 BEIS
Sequence Number: 175
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: S
Type: PERM
Resource Name: SRC

Description
SOFTWARE PROGRAM ERROR

Probable Causes
APPLICATION PROGRAM

Failure Causes
SOFTWARE PROGRAM

 Recommended Actions
 MANUALLY RESTART SUBSYSTEM IF NEEDED

Detail Data
SYMPTOM CODE
 512
SOFTWARE ERROR CODE
 -9017
ERROR CODE
 0
DETECTING MODULE
'srchevn.c'@line:'350'
FAILING MODULE
clstrmgrES
---------------------------------------------------------------------------
LABEL: CORE_DUMP
IDENTIFIER: B6048838

Date/Time: Fri Jun 24 16:24:34 BEIS
Sequence Number: 174
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: S
Type: PERM
Resource Name: SYSPROC

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

 Recommended Actions
 CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

 Recommended Actions
 RERUN THE APPLICATION PROGRAM
 IF PROBLEM PERSISTS THEN DO THE FOLLOWING
 CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
 11
USER'S PROCESS ID:
 16234
FILE SYSTEM SERIAL NUMBER
 11
INODE NUMBER
 4098
PROCESSOR ID
 0
CORE FILE NAME
/xbstation/xb/server/bin/core
PROGRAM NAME
Dept_Dispose
ADDITIONAL INFORMATION
??
??
Unable to generate symptom string.
---------------------------------------------------------------------------
LABEL: CORE_DUMP
IDENTIFIER: B6048838

Date/Time: Fri Jun 24 16:17:42 BEIS
Sequence Number: 173
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: S
Type: PERM
Resource Name: SYSPROC

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

 Recommended Actions
 CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

 Recommended Actions
 RERUN THE APPLICATION PROGRAM
 IF PROBLEM PERSISTS THEN DO THE FOLLOWING
 CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
 11
USER'S PROCESS ID:
 11916
FILE SYSTEM SERIAL NUMBER
 11
INODE NUMBER
 4098
PROCESSOR ID
 1
CORE FILE NAME
/xbstation/xb/server/bin/core
PROGRAM NAME
Dept_Dispose
ADDITIONAL INFORMATION
_ptrgl 0
??
Unable to generate symptom string.
---------------------------------------------------------------------------
LABEL: CORE_DUMP
IDENTIFIER: B6048838

Date/Time: Fri Jun 24 16:09:35 BEIS
Sequence Number: 172
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: S
Type: PERM
Resource Name: SYSPROC

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

 Recommended Actions
 CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

 Recommended Actions
 RERUN THE APPLICATION PROGRAM
 IF PROBLEM PERSISTS THEN DO THE FOLLOWING
 CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
 11
USER'S PROCESS ID:
 58854
FILE SYSTEM SERIAL NUMBER
 11
INODE NUMBER
 4098
PROCESSOR ID
 1
CORE FILE NAME
/xbstation/xb/server/bin/core
PROGRAM NAME
Dept_Dispose
ADDITIONAL INFORMATION
??
??
Unable to generate symptom string.
---------------------------------------------------------------------------
LABEL: FCP_ARRAY_ERR4
IDENTIFIER: D5385D18

Date/Time: Thu May 5 22:48:15 BEIS
Sequence Number: 163
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: H
Type: TEMP
Resource Name: hdisk2
Resource Class: disk
Resource Type: array
Location: U0.1-P2-I3/Q1-W200400A0B818048A-L0

Description
ARRAY OPERATION ERROR

Probable Causes
ARRAY DASD MEDIA
ARRAY DASD DEVICE

Failure Causes
DASD MEDIA
DISK DRIVE

 Recommended Actions
 PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
SENSE DATA
0A00 2E08 010E F888 0000 0804 0000 0000 0000 0000 0000 02E3 0200 0300 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 3354 8000 F205 2101 0000 0000 0000 0000 0000 0000 EF00 64B0 0000 0000
0000 0000
---------------------------------------------------------------------------
LABEL: TS_LATEHB_PE
IDENTIFIER: 3C81E43F

Date/Time: Wed Apr 27 11:40:58 BEIS
Sequence Number: 122
Machine Id: 00570B1E4C00
Node Id: xbserver1
Class: U
Type: PERF
Resource Name: topsvcs
Resource Class: NONE
Resource Type: NONE
Location:
VPD:

Description
Late in sending heartbeat

Probable Causes
Heavy CPU load
Severe physical memory shortage
Heavy I/O activities

Failure Causes
Daemon can not get required system resource

 Recommended Actions
 Reduce the system load

Detail Data
DETECTING MODULE
rsct,bootstrp.C,1.184,4520
ERROR ID
6zESUw.8bkP0/CRG.22kN8....................
REFERENCE CODE

A heartbeat is late by the following number of seconds
 3590
Author:
tony8201
Time:
2005-07-13 17:45
Before this, hacmp.out kept reporting this error:

00:00000:00029:2005/07/11 11:45:58.25 kernel Cannot read, host process disconnected: 1116 spid: 29
00:00000:00046:2005/07/11 11:47:15.19 kernel Cannot read, host process disconnected: 1672 spid: 46
00:00000:00117:2005/07/11 13:11:36.77 kernel Cannot read, host process disconnected: 1496 spid: 117
00:00000:00161:2005/07/11 13:11:52.10 kernel Cannot read, host process disconnected: 1120 spid: 161
00:00000:00095:2005/07/11 13:12:32.84 kernel Cannot read, host process disconnected: 984 spid: 95
00:00000:00063:2005/07/11 13:12:33.85 kernel Cannot read, host process disconnected: 192 spid: 63
00:00000:00101:2005/07/11 13:13:09.28 kernel Cannot read, host process disconnected: 1552 spid: 101
00:00000:00138:2005/07/11 13:13:10.47 kernel Cannot read, host process disconnected: 2012 spid: 138
00:00000:00078:2005/07/11 13:35:21.32 kernel Cannot read, host process disconnected: 3712 spid: 78
00:00000:00157:2005/07/11 13:41:52.56 kernel Cannot read, host process disconnected: 932 spid: 157
00:00000:00100:2005/07/11 13:44:39.44 kernel Cannot read, host process disconnected: 1064 spid: 100
00:00000:00048:2005/07/11 13:45:02.54 kernel Cannot read, host process disconnected: 636 spid: 48
00:00000:00067:2005/07/11 13:48:46.76 kernel Cannot read, host process disconnected: 1980 spid: 67
00:00000:00014:2005/07/11 13:50:45.35 kernel Cannot read, host process disconnected: 1056 spid: 14
00:00000:00025:2005/07/11 13:52:13.68 kernel Cannot read, host process disconnected: 244 spid: 25
00:00000:00050:2005/07/11 13:52:15.68 kernel Cannot read, host process disconnected: 480 spid: 50
00:00000:00168:2005/07/11 14:27:16.55 kernel Cannot read, host process disconnected: 506 1240 spid: 168
00:00000:00160:2005/07/11 14:27:45.20 kernel Cannot read, host process disconnected: 506 1916 spid: 160
00:00000:00000:2005/07/11 14:38:04.42 kernel nrpacket: recv, Connection timed out
00:00000:00000:2005/07/11 14:58:59.43 kernel nrpacket: recv, Connection timed out
00:00000:00157:2005/07/11 15:35:29.66 kernel Cannot read, host process disconnected: 107 1720 spid: 157
00:00000:00019:2005/07/11 15:35:36.89 kernel Cannot read, host process disconnected: 640 spid: 19
00:00000:00057:2005/07/11 16:16:42.44 kernel Cannot read, host process disconnected: 304 668 spid: 57
00:00000:00078:2005/07/11 16:25:01.20 kernel Cannot read, host process disconnected: 302 344 spid: 78
00:00000:00104:2005/07/11 16:35:53.40 kernel Cannot read, host process disconnected: 103 624 spid: 104
00:00000:00168:2005/07/11 17:00:36.45 kernel Cannot read, host process disconnected: 1104 spid: 168
00:00000:00026:2005/07/11 17:00:57.26 kernel Cannot read, host process disconnected: 1140 spid: 26
00:00000:00066:2005/07/11 17:01:25.53 kernel Cannot read, host process disconnected: 480 spid: 66
00:00000:00132:2005/07/11 17:03:37.63 kernel Cannot read, host process disconnected: 109 1764 spid: 132
00:00000:00057:2005/07/11 17:19:44.78 kernel Cannot read, host process disconnected: 400 spid: 57
00:00000:00026:2005/07/11 17:20:20.33 kernel Cannot read, host process disconnected: 1564 spid: 26
00:00000:00073:2005/07/11 17:21:15.06 kernel Cannot read, host process disconnected: 1428 spid: 73
00:00000:00160:2005/07/11 17:21:30.87 kernel Cannot read, host process disconnected: 1324 spid: 160
00:00000:00069:2005/07/11 17:27:23.17 kernel Cannot read, host process disconnected: 552 spid: 69
00:00000:00095:2005/07/11 18:02:47.27 kernel Cannot read, host process disconnected: 509 1144 spid: 95
00:00000:00090:2005/07/11 18:32:32.59 kernel Cannot read, host process disconnected: 309 548 spid: 90
00:00000:00131:2005/07/11 18:51:44.69 kernel Cannot read, host process disconnected: 302 664 spid: 131
00:00000:00063:2005/07/11 18:52:40.59 kernel Cannot read, host process disconnected: 1912 spid: 63
00:00000:00097:2005/07/11 19:55:59.92 kernel Cannot read, host process disconnected: 1104 spid: 97
00:00000:00011:2005/07/11 21:35:05.08 kernel Cannot read, host process disconnected: 1496 spid: 11
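To judge whether these disconnect messages cluster before the outage, the lines can be tallied per hour. A minimal sketch, assuming every line keeps the prefix layout seen in the paste (three colon-separated counters, then date, time and the word "kernel"); the regular expression, the function name and the SAMPLE list are illustrative.

```python
import re
from collections import Counter

# Prefix layout assumed from the pasted lines: NN:NNNNN:NNNNN:YYYY/MM/DD hh:mm:ss.cc kernel <message>
LINE_RE = re.compile(
    r"^\d{2}:\d{5}:\d{5}:(\d{4}/\d{2}/\d{2}) (\d{2}):\d{2}:\d{2}\.\d{2}\s+kernel\s+(.*)$"
)

def disconnects_per_hour(lines):
    """Count 'host process disconnected' messages per date and hour."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "host process disconnected" in match.group(3):
            counts[f"{match.group(1)} {match.group(2)}:00"] += 1
    return counts

# Four lines copied from the paste; the full log would be read from a file instead.
SAMPLE = [
    "00:00000:00029:2005/07/11 11:45:58.25 kernel Cannot read, host process disconnected: 1116 spid: 29",
    "00:00000:00046:2005/07/11 11:47:15.19 kernel Cannot read, host process disconnected: 1672 spid: 46",
    "00:00000:00117:2005/07/11 13:11:36.77 kernel Cannot read, host process disconnected: 1496 spid: 117",
    "00:00000:00000:2005/07/11 14:38:04.42 kernel nrpacket: recv, Connection timed out",
]

if __name__ == "__main__":
    for hour, count in sorted(disconnects_per_hour(SAMPLE).items()):
        print(hour, count)
```

The "nrpacket" timeout lines are skipped on purpose, so only client disconnects are counted.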
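Author:
tony8201
Time:
2005-07-13 17:54
hacmp.out has no entries after 2005/07/11 21:35:05.08 until the administrator noticed the problem on the evening of the 12th.

Hardware diagnostics found no problems, and according to the administrator there were no power or network problems.

FE2DEE00 0711234905 P S SYSXAIXIF DUPLICATE IP ADDRESS DETECTED IN THE NET

This error should mean a duplicate IP address. Could that be the cause?

2BFA76F6 0711234905 T S SYSPROC SYSTEM SHUTDOWN BY USER

But why would this entry appear? Nobody touched the host during this period.

The above are the problems seen on machine A.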
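The Detail Data of the AIXIF_ARP_DUP_ADDR entries gives the duplicated address and the MAC that answered the ARP request as raw hex words ("0A67 0103" and "000D 600B 8DE2" on machine A). A minimal sketch of turning them into readable form, assuming each byte maps directly to one octet; the helper names are hypothetical.

```python
def decode_ip(hex_words: str) -> str:
    """Convert errpt hex words such as '0A67 0103' into dotted-quad notation."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes for an IPv4 address")
    return ".".join(str(byte) for byte in raw)

def decode_mac(hex_words: str) -> str:
    """Convert errpt hex words such as '000D 600B 8DE2' into colon-separated form."""
    raw = bytes.fromhex(hex_words.replace(" ", ""))
    if len(raw) != 6:
        raise ValueError("expected exactly 6 bytes for a MAC address")
    return ":".join(f"{byte:02x}" for byte in raw)

if __name__ == "__main__":
    print(decode_ip("0A67 0103"))        # 10.103.1.3
    print(decode_mac("000D 600B 8DE2"))  # 00:0d:60:0b:8d:e2
```

The decoded MAC can then be compared with the hardware addresses of the adapters on both cluster nodes and on other hosts in the subnet to see which device answered for 10.103.1.3.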
Author:
tony8201
Time:
2005-07-13 17:57
Errors on machine B:

#errpt
2BFA76F6   0712190805 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   0712200405 T O errdemon       ERROR LOGGING TURNED ON
AA8AB241   0712190805 T O OPERATOR       OPERATOR NOTIFICATION
BC3BE5A3   0712190805 P S SRC            SOFTWARE PROGRAM ERROR
B6048838   0712190805 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
64368504   0711235205 P O grpsvcs        Connection failure between Group Service
173C787F   0711235105 I S topsvcs        Possible malfunction on local adapter
FE2DEE00   0711235105 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   0711235105 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
Author:
tony8201
Time:
2005-07-13 17:59
#errpt -a

---------------------------------------------------------------------------
LABEL: REBOOT_ID
IDENTIFIER: 2BFA76F6

Date/Time: Tue Jul 12 19:08:51 BEIS
Sequence Number: 178
Machine Id: 00570A9E4C00
Node Id: localhost
Class: S
Type: TEMP
Resource Name: SYSPROC

Description
SYSTEM SHUTDOWN BY USER

Probable Causes
SYSTEM SHUTDOWN

Detail Data
USER ID
 0
0=SOFT IPL 1=HALT 2=TIME REBOOT
 1
TIME TO REBOOT (FOR TIMED REBOOT ONLY)
 0
---------------------------------------------------------------------------
LABEL: ERRLOG_ON
IDENTIFIER: 9DBCFDEE

Date/Time: Tue Jul 12 20:04:11 BEIS
Sequence Number: 177
Machine Id: 00570A9E4C00
Node Id: localhost
Class: O
Type: TEMP
Resource Name: errdemon

Description
ERROR LOGGING TURNED ON

Probable Causes
ERRDEMON STARTED AUTOMATICALLY

User Causes
/USR/LIB/ERRDEMON COMMAND

 Recommended Actions
 NONE

---------------------------------------------------------------------------
LABEL: OPMSG
IDENTIFIER: AA8AB241

Date/Time: Tue Jul 12 19:08:43 BEIS
Sequence Number: 176
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: O
Type: TEMP
Resource Name: OPERATOR

Description
OPERATOR NOTIFICATION

User Causes
ERRLOGGER COMMAND

 Recommended Actions
 REVIEW DETAILED DATA

Detail Data
MESSAGE FROM ERRLOGGER COMMAND
clexit.rc : Unexpected termination of clstrmgrES
---------------------------------------------------------------------------
LABEL: SRC_SVKO
IDENTIFIER: BC3BE5A3

Date/Time: Tue Jul 12 19:08:41 BEIS
Sequence Number: 175
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: S
Type: PERM
Resource Name: SRC

Description
SOFTWARE PROGRAM ERROR

Probable Causes
APPLICATION PROGRAM

Failure Causes
SOFTWARE PROGRAM

 Recommended Actions
 MANUALLY RESTART SUBSYSTEM IF NEEDED

Detail Data
SYMPTOM CODE
 721035
SOFTWARE ERROR CODE
 -9017
ERROR CODE
 0
DETECTING MODULE
'srchevn.c'@line:'350'
FAILING MODULE
clstrmgrES
---------------------------------------------------------------------------
LABEL: CORE_DUMP
IDENTIFIER: B6048838

Date/Time: Tue Jul 12 19:08:41 BEIS
Sequence Number: 174
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: S
Type: PERM
Resource Name: SYSPROC

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

 Recommended Actions
 CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

 Recommended Actions
 RERUN THE APPLICATION PROGRAM
 IF PROBLEM PERSISTS THEN DO THE FOLLOWING
 CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
 11
USER'S PROCESS ID:
 37274
FILE SYSTEM SERIAL NUMBER
 1
INODE NUMBER
 2
PROCESSOR ID
 0
CORE FILE NAME
/core
PROGRAM NAME
clstrmgr
ADDITIONAL INFORMATION
ha_gs_sen 1C0
ha_gs_sen 1B0
Unable to generate symptom string.
---------------------------------------------------------------------------
LABEL: GS_TS_RETCODE_ER
IDENTIFIER: 64368504

Date/Time: Mon Jul 11 23:52:07 BEIS
Sequence Number: 173
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: O
Type: PERM
Resource Name: grpsvcs

Description
Connection failure between Group Services and Topology Services

Probable Causes
Topology Services daemon is not running
Topology Services daemon has died
Topology Services library has detected an error

Failure Causes
Group Services detects an error condition of Topology Services

 Recommended Actions
 Check the Topology Services daemon
Verify that Group Services daemon has been restarted
Call IBM Service if problem persists

Detail Data
DETECTING MODULE
RSCT,PMAdaptMbr.C,1.44,716
ERROR ID
62IcBY/bKdo0/cac132kN8....................
REFERENCE CODE

DIAGNOSTIC EXPLANATION
Received unknown adapter [97] for PMAdaptMbr name [allAdapter_13_0_ether] from hats.
---------------------------------------------------------------------------
LABEL: TS_LOC_DOWN_ST
IDENTIFIER: 173C787F

Date/Time: Mon Jul 11 23:51:49 BEIS
Sequence Number: 172
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: S
Type: INFO
Resource Name: topsvcs

Description
Possible malfunction on local adapter

Probable Causes
Local adapter mal-functioned
Local adapter lost connection to network
Local adapter mis-configured

Failure Causes
Local adapter mal-functioned
Local adapter lost connection to network
Local adapter mis-configured

 Recommended Actions
 Verify adapter configuration
 Verify network connectivity

Detail Data
DETECTING MODULE
rsct,nim_control.C,1.38,4143
ERROR ID
6zV5DL.JKdo0/zzx/32kN8....................
REFERENCE CODE

Adapter interface name
tty0
Adapter offset
 2
Adapter IP address
255.255.0.1
---------------------------------------------------------------------------
LABEL: AIXIF_ARP_DUP_ADDR
IDENTIFIER: FE2DEE00

Date/Time: Mon Jul 11 23:51:02 BEIS
Sequence Number: 171
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: S
Type: PERM
Resource Name: SYSXAIXIF

Description
DUPLICATE IP ADDRESS DETECTED IN THE NET

Failure Causes
ARP RESPONSE RECEIVED FOR MY IP ADDRESS

 Recommended Actions
 CONTACT NETWORK ADMINISTRATOR

Detail Data
DUPLICATE IP ADDRESS
0A67 0103
MAC ADDRESS
000D 600B 866E
---------------------------------------------------------------------------
LABEL: AIXIF_ARP_DUP_ADDR
IDENTIFIER: FE2DEE00

Date/Time: Mon Jul 11 23:51:01 BEIS
Sequence Number: 170
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: S
Type: PERM
Resource Name: SYSXAIXIF

Description
DUPLICATE IP ADDRESS DETECTED IN THE NET

Failure Causes
ARP RESPONSE RECEIVED FOR MY IP ADDRESS

 Recommended Actions
 CONTACT NETWORK ADMINISTRATOR

Detail Data
DUPLICATE IP ADDRESS
0A67 0103
MAC ADDRESS
000D 600B 866E
Author:
yanbing
Time:
2005-07-13 21:52
It is the duplicate IP address problem...

The shutdown is a forced shutdown controlled by HACMP.
Author:
tony8201
Time:
2005-07-14 09:17
If machine A went down on the night of the 11th because of the duplicate IP address, then why did machine B go down on the evening of the 12th? There are only these entries:

2BFA76F6   0712190805 T S SYSPROC        SYSTEM SHUTDOWN BY USER
9DBCFDEE   0712200405 T O errdemon       ERROR LOGGING TURNED ON
AA8AB241   0712190805 T O OPERATOR       OPERATOR NOTIFICATION
BC3BE5A3   0712190805 P S SRC            SOFTWARE PROGRAM ERROR
B6048838   0712190805 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED

---------------------------------------------------------------------------
LABEL: REBOOT_ID
IDENTIFIER: 2BFA76F6

Date/Time: Tue Jul 12 19:08:51 BEIS
Sequence Number: 178
Machine Id: 00570A9E4C00
Node Id: localhost
Class: S
Type: TEMP
Resource Name: SYSPROC

Description
SYSTEM SHUTDOWN BY USER

Probable Causes
SYSTEM SHUTDOWN

Detail Data
USER ID
 0
0=SOFT IPL 1=HALT 2=TIME REBOOT
 1
TIME TO REBOOT (FOR TIMED REBOOT ONLY)
 0
---------------------------------------------------------------------------
LABEL: ERRLOG_ON
IDENTIFIER: 9DBCFDEE

Date/Time: Tue Jul 12 20:04:11 BEIS
Sequence Number: 177
Machine Id: 00570A9E4C00
Node Id: localhost
Class: O
Type: TEMP
Resource Name: errdemon

Description
ERROR LOGGING TURNED ON

Probable Causes
ERRDEMON STARTED AUTOMATICALLY

User Causes
/USR/LIB/ERRDEMON COMMAND

 Recommended Actions
 NONE

---------------------------------------------------------------------------
LABEL: OPMSG
IDENTIFIER: AA8AB241

Date/Time: Tue Jul 12 19:08:43 BEIS
Sequence Number: 176
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: O
Type: TEMP
Resource Name: OPERATOR

Description
OPERATOR NOTIFICATION

User Causes
ERRLOGGER COMMAND

 Recommended Actions
 REVIEW DETAILED DATA

Detail Data
MESSAGE FROM ERRLOGGER COMMAND
clexit.rc : Unexpected termination of clstrmgrES
---------------------------------------------------------------------------
LABEL: SRC_SVKO
IDENTIFIER: BC3BE5A3

Date/Time: Tue Jul 12 19:08:41 BEIS
Sequence Number: 175
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: S
Type: PERM
Resource Name: SRC

Description
SOFTWARE PROGRAM ERROR

Probable Causes
APPLICATION PROGRAM

Failure Causes
SOFTWARE PROGRAM

 Recommended Actions
 MANUALLY RESTART SUBSYSTEM IF NEEDED

Detail Data
SYMPTOM CODE
 721035
SOFTWARE ERROR CODE
 -9017
ERROR CODE
 0
DETECTING MODULE
'srchevn.c'@line:'350'
FAILING MODULE
clstrmgrES
---------------------------------------------------------------------------
LABEL: CORE_DUMP
IDENTIFIER: B6048838

Date/Time: Tue Jul 12 19:08:41 BEIS
Sequence Number: 174
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: S
Type: PERM
Resource Name: SYSPROC

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

 Recommended Actions
 CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

 Recommended Actions
 RERUN THE APPLICATION PROGRAM
 IF PROBLEM PERSISTS THEN DO THE FOLLOWING
 CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
 11
USER'S PROCESS ID:
 37274
FILE SYSTEM SERIAL NUMBER
 1
INODE NUMBER
 2
PROCESSOR ID
 0
CORE FILE NAME
/core
PROGRAM NAME
clstrmgr
ADDITIONAL INFORMATION
ha_gs_sen 1C0
ha_gs_sen 1B0
Unable to generate symptom string.
---------------------------------------------------------------------------
Author:
tony8201
Time:
2005-07-14 09:25
Host configuration: 2 CPUs, 2 GB of memory, FAStT600 disk enclosure.
System: AIX5.2 + ML04 + HACMP5.2 + SYBASE12.5 + C6.0
Author:
xuwelcome
Time:
2005-07-14 15:14
I have a question.
LABEL: TS_LOC_DOWN_ST
IDENTIFIER: 173C787F

Date/Time: Mon Jul 11 23:51:49 BEIS
Sequence Number: 172
Machine Id: 00570A9E4C00
Node Id: xbserver2
Class: S
Type: INFO
Resource Name: topsvcs

Description
Possible malfunction on local adapter

Probable Causes
Local adapter mal-functioned
Local adapter lost connection to network
Local adapter mis-configured

Failure Causes
Local adapter mal-functioned
Local adapter lost connection to network
Local adapter mis-configured

Recommended Actions
Verify adapter configuration
Verify network connectivity

Detail Data
DETECTING MODULE
rsct,nim_control.C,1.38,4143
ERROR ID
6zV5DL.JKdo0/zzx/32kN8....................
REFERENCE CODE

Adapter interface name
tty0
Adapter offset
 2
Adapter IP address
255.255.0.1

How can a tty have the IP address 255.255.0.1? Besides, 255.255.0.1 is not valid either as an address or as a netmask. I suspect there may be a problem with the HACMP configuration; it is worth checking.
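The observation about 255.255.0.1 can be checked mechanically. A minimal sketch using Python's standard ipaddress module: the address falls in the reserved 240.0.0.0/4 block, and as a netmask its one bits are not contiguous. The helper name is hypothetical, and the check says nothing about why Topology Services reports this value for tty0.

```python
import ipaddress

def is_contiguous_netmask(mask: str) -> bool:
    """Return True if the dotted-quad mask is a run of 1 bits followed only by 0 bits."""
    value = int(ipaddress.IPv4Address(mask))
    inverted = ~value & 0xFFFFFFFF
    # A valid mask inverts to 2**n - 1, so adding 1 leaves no bit in common.
    return (inverted & (inverted + 1)) == 0

if __name__ == "__main__":
    reported = "255.255.0.1"
    print(ipaddress.IPv4Address(reported).is_reserved)  # True: inside 240.0.0.0/4
    print(is_contiguous_netmask(reported))              # False: 0xFFFF0001 has a gap
    print(is_contiguous_netmask("255.255.0.0"))         # True: an ordinary /16 mask
```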
Author:
tony8201
Time:
2005-07-14 15:16
Error mail on machine A:

From root Wed Jul 13 04:06:58 2005
Received: (from root@localhost) by xbserver1_stdby (AIX5.2/8.11.6p2/8.11.0) id j6CK6vV37530 for root; Wed, 13 Jul 2005 04:06:57 +0800
Date: Wed, 13 Jul 2005 04:06:57 +0800
From: root
Message-Id: <200507122006.j6CK6vV37530@xbserver1_stdby>
To: root
Subject: diagela message from xbserver1
Status: RO

A PROBLEM WAS DETECTED ON Wed Jul 13 04:05:57 BEIST 2005 801014

The Service Request Number(s)/Probable Cause(s)
(causes are listed in descending order of probability):

 652-880: The CEC or SPCN reported a non-critical error. Report the SRN and
 the following reference and physical location codes to your service
 provider.
 Error log information:
 Date: Tue Jul 12 20:05:38 BEIST 2005
 Sequence number: 181
 Label: SCAN_ERROR_CHRP
 Ref. Code: B0061406 FRU: n/a n/a

From root Tue Jul 12 20:07:24 2005
Received: (from root@localhost) by localhost (AIX5.2/8.11.6p2/8.11.0) id j6CC7Nm05542 for root; Tue, 12 Jul 2005 20:07:23 +0800
Date: Tue, 12 Jul 2005 20:07:23 +0800
From: root
Message-Id: <200507121207.j6CC7Nm05542@localhost>
To: root
Subject: diagela message from localhost
Status: RO

A PROBLEM WAS DETECTED ON Tue Jul 12 20:07:21 BEIST 2005 801014

The Service Request Number(s)/Probable Cause(s)
(causes are listed in descending order of probability):

 652-880: The CEC or SPCN reported a non-critical error. Report the SRN and
 the following reference and physical location codes to your service
 provider.
 Error log information:
 Date: Tue Jul 12 20:05:38 BEIST 2005
 Sequence number: 181
 Label: SCAN_ERROR_CHRP
 Ref. Code: B0061406 FRU: n/a n/a

From root Tue Apr 12 14:01:00 2005
Received: (from root@localhost) by loopback (AIX5.2/8.11.6p2/8.11.0) id j3C510v42772 for root; Tue, 12 Apr 2005 14:01:00 +0900
Date: Tue, 12 Apr 2005 14:01:00 +0900
From: root
Message-Id: <200504120501.j3C510v42772@loopback>
To: root
Subject: diagela message from xbserver1
Status: RO

TESTING COMPLETE on Tue Apr 12 13:24:38 BEIDT 2005 801010

No trouble was found.

The resources tested were:

- proc1 U0.1-P1-C1 Processor
Author:
tony8201
Time:
2005-07-14 15:18
B0061406
This should be a system firmware (microcode) error. Does the firmware need to be upgraded?
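Author:
firefoxli
Time:
2005-07-15 16:17
"Hardware diagnostics found no problems, and according to the administrator there were no power or network problems." You can't take that entirely on trust. Why did the network report an IP conflict? Did anyone else touch it?

Following this thread. I have seen an article that said the following. Is this an HACMP environment?

1. Upgrade the bos.rte.libpthreads fileset to the latest level.
2. Lower the NIM failure detection rate.
smitty hacmp
cluster config
cluster topology
configure Network Modules
Change a Network Module using Predefined Values
Set the values for both rs232 and Ethernet to a slower setting.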
Author:
liudw
Time:
2005-07-16 00:08
This kind of tty fault is reported during a cluster failover; it is nothing unusual.
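Author:
强人
Time:
2005-09-26 17:09
Was the original poster's problem ever solved?
I feel nobody has hit the point yet.
Why is everyone ignoring the errors with identifier B6048838?
Could this be caused by software?
A duplicate IP would not cause a core dump, would it?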
Author:
novmcgrady
Time:
2006-04-08 10:23
Was it ever solved?

I have run into the same problem.
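Author:
herowangzj
Time:
2006-04-08 12:05
Following. I don't think this was caused by the IP conflict. I ran into an IP conflict when setting up HA, and it just reported one IP conflict error and that was all. Admittedly that was a conflict on the service IP, but it should not make much difference. It feels like a software problem. I hope an expert can advise!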
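Author:
Joker2004
Time:
2006-04-10 17:44
It should be the IP problem. With the IP conflict, the boot or standby (stb) adapters of the cluster nodes may have been knocked off the network, so the two nodes could no longer synchronize their state, and that brought them down.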
Author:
feiaix
Time:
2006-12-19 16:51
How was the original poster's problem solved? I have run into the same problem too. :(
BAD LUCK
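Author:
Jens
Time:
2006-12-21 17:07
I ran into the same problem before. Even the people from the maintenance company who came on site could not solve it; things were fine after both machines were rebooted...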
Author:
feiaix
Time:
2006-12-22 09:17
Could the heartbeat be misconfigured, so that the machine went down when I/O got too heavy?
Author:
lj_cd
Time:
2006-12-22 10:25
People come to ask when they have a problem, and then say nothing once it is solved. Sigh...