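At 00:17 on December 27, the standby node of our HA hot-standby cluster (two P55a machines) went down again. It also went down on November 15, and the same problem occurred on September 15, so this is the third time. Could the experts here please help diagnose whether these crashes share the same cause, and where the problem actually lies? Thanks in advance, any help is much appreciated!
HA on the standby node reported no error messages at all!
Attached: details of the previous crash: http://bbs.chinaunix.net/thread-1313204-1-1.html
Below is the errpt -a output for this crash.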
---------------------------------------------------------------------------
LABEL: GS_START_ST
IDENTIFIER: AFA89905
Date/Time: Sat Dec 27 11:10:43 BEIST 2008
Sequence Number: 424
Machine Id: 0006B665D600
Node Id: p55b
Class: O
Type: INFO
Resource Name: grpsvcs
Description
Group Services daemon started
Probable Causes
Daemon started during system startup
Daemon re-started automatically by SRC
Daemon started during installation
Daemon started manually by user
User Causes
Daemon started manually by user
Recommended Actions
Check that Group Services daemon is running
Detail Data
DETECTING MODULE
RSCT,pgsd.C,1.62.1.8,606
ERROR ID
63Y7ej0nmNJ7/4J800...8....................
REFERENCE CODE
DIAGNOSTIC EXPLANATION
HAGS daemon started by SRC. Log file is /var/ha/log/grpsvcs_trace_2_20.
---------------------------------------------------------------------------
LABEL: TS_START_ST
IDENTIFIER: 97419D60
Date/Time: Sat Dec 27 11:10:39 BEIST 2008
Sequence Number: 423
Machine Id: 0006B665D600
Node Id: p55b
Class: O
Type: INFO
Resource Name: topsvcs
Description
Topology Services daemon started
Probable Causes
Daemon started during system start-up
Daemon re-started automatically by SRC
Daemon started during installation
Daemon started manually by user
User Causes
Daemon started manually by user
Recommended Actions
Confirm that this is desirable
Detail Data
DETECTING MODULE
rsct,bootstrp.C,1.211,4459
ERROR ID
6UpNEL0jmNJ7/r26.0...8....................
REFERENCE CODE
Topology Services daemon started by:
SRC
Topology Services daemon log file location
/var/ha/log/topsvcs.27.111038.btpclus.en_US
Topology Services daemon run directory
/var/ha/run/topsvcs.btpclus/
---------------------------------------------------------------------------
LABEL: RMCD_INFO_0_ST
IDENTIFIER: A6DF45AA
Date/Time: Sat Dec 27 10:55:55 BEIST 2008
Sequence Number: 422
Machine Id: 0006B665D600
Node Id: p55b
Class: O
Type: INFO
Resource Name: RMCdaemon
Description
The daemon is started.
Probable Causes
The Resource Monitoring and Control daemon has been started.
User Causes
The startsrc -s ctrmc command has been executed or
the rmcctrl -s command has been executed.
Recommended Actions
Confirm that the daemon should be started.
Detail Data
DETECTING MODULE
RSCT,rmcd.c,1.52,211
ERROR ID
6eKora0vYNJ7/mZ4/0...8....................
REFERENCE CODE
---------------------------------------------------------------------------
LABEL: REBOOT_ID
IDENTIFIER: 2BFA76F6
Date/Time: Sat Dec 27 10:54:22 BEIST 2008
Sequence Number: 420
Machine Id: 0006B665D600
Node Id: localhost
Class: S
Type: TEMP
Resource Name: SYSPROC
Description
SYSTEM SHUTDOWN BY USER
Probable Causes
SYSTEM SHUTDOWN
Detail Data
USER ID
0
0=SOFT IPL 1=HALT 2=TIME REBOOT
1
TIME TO REBOOT (FOR TIMED REBOOT ONLY)
0
---------------------------------------------------------------------------
LABEL: ERRLOG_ON
IDENTIFIER: 9DBCFDEE
Date/Time: Sat Dec 27 10:55:36 BEIST 2008
Sequence Number: 419
Machine Id: 0006B665D600
Node Id: localhost
Class: O
Type: TEMP
Resource Name: errdemon
Description
ERROR LOGGING TURNED ON
Probable Causes
ERRDEMON STARTED AUTOMATICALLY
User Causes
/USR/LIB/ERRDEMON COMMAND
Recommended Actions
NONE
---------------------------------------------------------------------------
LABEL: TS_STOP_ST
IDENTIFIER: 6D19271E
Date/Time: Sat Dec 27 00:17:34 BEIST 2008
Sequence Number: 418
Machine Id: 0006B665D600
Node Id: p55b
Class: O
Type: INFO
Resource Name: topsvcs
Description
Topology Services daemon stopped
Probable Causes
Daemon stopped by SRC
Daemon stopped by signal
User Causes
Daemon stopped by user
Recommended Actions
Confirm that this is desirable
Detail Data
DETECTING MODULE
rsct,comm.C,1.147,634
ERROR ID
6SQG4h/SCEJ7/cVe00...8....................
REFERENCE CODE
6UpNEL0wrq57/oBg/0...8....................
Topology Services daemon stopped by:
Signal SIGTERM
---------------------------------------------------------------------------
LABEL: OPMSG
IDENTIFIER: AA8AB241
Date/Time: Sat Dec 27 00:17:33 BEIST 2008
Sequence Number: 417
Machine Id: 0006B665D600
Node Id: p55b
Class: O
Type: TEMP
Resource Name: OPERATOR
Description
OPERATOR NOTIFICATION
User Causes
ERRLOGGER COMMAND
Recommended Actions
REVIEW DETAILED DATA
Detail Data
MESSAGE FROM ERRLOGGER COMMAND
clexit.rc : Unexpected termination of clstrmgrES
---------------------------------------------------------------------------
LABEL: SRC_SVKO
IDENTIFIER: BC3BE5A3
Date/Time: Sat Dec 27 00:17:33 BEIST 2008
Sequence Number: 416
Machine Id: 0006B665D600
Node Id: p55b
Class: S
Type: PERM
Resource Name: SRC
Description
SOFTWARE PROGRAM ERROR
Probable Causes
APPLICATION PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
MANUALLY RESTART SUBSYSTEM IF NEEDED
Detail Data
SYMPTOM CODE
1024
SOFTWARE ERROR CODE
-9017
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'350'
FAILING MODULE
clstrmgrES
---------------------------------------------------------------------------
LABEL: SRC_RSTRT
IDENTIFIER: BA431EB7
Date/Time: Sat Dec 27 00:17:33 BEIST 2008
Sequence Number: 415
Machine Id: 0006B665D600
Node Id: p55b
Class: S
Type: PERM
Resource Name: SRC
Description
SOFTWARE PROGRAM ERROR
Probable Causes
APPLICATION PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
VERIFY SUBSYSTEM RESTARTED AUTOMATICALLY
Detail Data
SYMPTOM CODE
0
SOFTWARE ERROR CODE
-9035
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'217'
FAILING MODULE
emsvcs
---------------------------------------------------------------------------
LABEL: SRC_SVKO
IDENTIFIER: BC3BE5A3
Date/Time: Sat Dec 27 00:17:33 BEIST 2008
Sequence Number: 414
Machine Id: 0006B665D600
Node Id: p55b
Class: S
Type: PERM
Resource Name: SRC
Description
SOFTWARE PROGRAM ERROR
Probable Causes
APPLICATION PROGRAM
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
MANUALLY RESTART SUBSYSTEM IF NEEDED
Detail Data
SYMPTOM CODE
3840
SOFTWARE ERROR CODE
-9017
ERROR CODE
0
DETECTING MODULE
'srchevn.c'@line:'350'
FAILING MODULE
grpsvcs
---------------------------------------------------------------------------
LABEL: HA002_ER
IDENTIFIER: 12081DC6
Date/Time: Sat Dec 27 00:17:33 BEIST 2008
Sequence Number: 413
Machine Id: 0006B665D600
Node Id: p55b
Class: S
Type: PERM
Resource Name: haemd
Description
SOFTWARE PROGRAM ERROR
Probable Causes
SUBSYSTEM
Failure Causes
SUBSYSTEM
Recommended Actions
REPORT DETAILED DATA
CONTACT APPROPRIATE SERVICE REPRESENTATIVE
Detail Data
DETECTING MODULE
LPP=PSSP,Fn=emd_gsi.c,SID=1.4.1.37,L#=1395,
DIAGNOSTIC EXPLANATION
haemd: 2521-032 Cannot dispatch group services (1).
---------------------------------------------------------------------------
LABEL: GS_XSTALE_PRCLM_ER
IDENTIFIER: 657D8FFA
Date/Time: Sat Dec 27 00:17:32 BEIST 2008
Sequence Number: 412
Machine Id: 0006B665D600
Node Id: p55b
Class: O
Type: PERM
Resource Name: grpsvcs
Description
Group Services daemon exit to re-join the domain
Probable Causes
Topology Services daemon reports inconsistent node down and up events
Failure Causes
Network has been a temporal problem
Recommended Actions
Verify that Group Services daemon has been restarted
Call IBM Service if problem persists
Detail Data
DETECTING MODULE
RSCT,NsMsg.C,1.77,1070
ERROR ID
6uzMTZ/QCEJ7/QZf10...8....................
REFERENCE CODE
DIAGNOSTIC EXPLANATION
Got a non-stale Proclaim message from my NS(domId=1.45). He must have deleted me, so I'm exiting to
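Since errpt lists the newest entry first, the sequence of events is easier to follow when the entries are read oldest-first. Below is a minimal sketch, assuming the errpt -a output has been saved as text and that each entry starts after a line of dashes, as in the dump above. The names parse_errpt and timeline are hypothetical helpers for illustration and are not part of AIX; the sample holds two entries copied from this log, reduced to a few header fields.

```python
import re

# Header fields worth keeping for a quick timeline (a subset of what errpt -a prints).
FIELD_RE = re.compile(r"^(LABEL|Date/Time|Sequence Number|Resource Name):\s*(.*)$")
# Entries in the dump are delimited by a long line of dashes.
SEPARATOR_RE = re.compile(r"^-{10,}\s*$")


def parse_errpt(text):
    """Split errpt -a text into one dict per entry, keeping selected header fields."""
    entries, current = [], {}
    for line in text.splitlines():
        if SEPARATOR_RE.match(line):
            if current:
                entries.append(current)
            current = {}
            continue
        match = FIELD_RE.match(line.strip())
        if match:
            current[match.group(1)] = match.group(2).strip()
    if current:
        entries.append(current)
    return entries


def timeline(entries):
    """Return entries oldest-first, ordered by their sequence number."""
    numbered = [e for e in entries if "Sequence Number" in e]
    return sorted(numbered, key=lambda e: int(e["Sequence Number"]))


# Two entries taken from the log above, trimmed to the header fields.
sample = """\
---------------------------------------------------------------------------
LABEL: TS_STOP_ST
Date/Time: Sat Dec 27 00:17:34 BEIST 2008
Sequence Number: 418
Resource Name: topsvcs
---------------------------------------------------------------------------
LABEL: GS_XSTALE_PRCLM_ER
Date/Time: Sat Dec 27 00:17:32 BEIST 2008
Sequence Number: 412
Resource Name: grpsvcs
"""

for entry in timeline(parse_errpt(sample)):
    print(entry["Sequence Number"], entry["Date/Time"],
          entry["LABEL"], entry["Resource Name"])
```

Run against the full dump, this would place the grpsvcs entry (sequence 412) ahead of the SRC, clstrmgrES and topsvcs entries that follow it, which makes it easier to compare this crash with the earlier ones.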
[ Last edited by popoq on 2008-12-27 14:22 ]