Environment: two p650 hosts running AIX 5200-3; disk array S4300 (8 × 73 GB, RAID 5); application: Domino 6.5.3 (hosts A and B each run 2 partitions); HACMP 5.1

Symptoms:
1. One day a Domino partition failed and the application went down. Because we had deployed a custom application monitor script, HACMP should have detected that this Domino partition was down and restarted the application. HACMP did detect that the partition was down (the monitor script wrote log output), but it never ran the stop/start scripts to restart the partition.
2. Afterwards we tested the HACMP monitoring by simulating, in the application monitor script, a Domino partition being down. For every partition we tested (one on host A and one on host B), the monitor script did not detect the failure, wrote no log output, and Domino was not restarted.

Relevant logs:
1. /tmp/clstrmgr.debug
... (starting from the 13th, many entries like the following)
Fri May 19 17:59:13 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
Fri May 19 17:59:43 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
Fri May 19 18:00:13 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
...
2. /tmp/hacmp.out
... (present from very early on, many entries like the following)
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8004000 seconds. Please check cluster status.
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8007600 seconds. Please check cluster status.
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8011200 seconds. Please check cluster status.
...

Could anyone please take a look? Thanks.
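
To put the hacmp.out warnings in perspective, here is a minimal sketch that pulls the reported duration out of each warning line and converts it to days. It is not an HACMP tool: the function name `recovery_durations` and the regular expression are assumptions made for illustration, and the sample input is just the three warning lines quoted from the log.

```python
import re

# Matches the warning format quoted from /tmp/hacmp.out in this post.
# The pattern is an assumption based only on those three sample lines.
WARNING_RE = re.compile(
    r"running recovery program '(?P<program>[^']+)' for (?P<seconds>\d+) seconds"
)


def recovery_durations(lines):
    """Return a list of (program, seconds) for every matching warning line."""
    results = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            results.append((match.group("program"), int(match.group("seconds"))))
    return results


# The three warning lines quoted from the log.
sample = [
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8004000 seconds. "
    "Please check cluster status.",
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8007600 seconds. "
    "Please check cluster status.",
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8011200 seconds. "
    "Please check cluster status.",
]

durations = recovery_durations(sample)
for program, seconds in durations:
    # 86400 seconds per day
    print(f"{program}: {seconds} s = {seconds / 86400:.1f} days")

# Gap between consecutive warnings, in seconds.
intervals = [later[1] - earlier[1] for earlier, later in zip(durations, durations[1:])]
print("interval between warnings (s):", intervals)
```

By this arithmetic the first quoted warning corresponds to roughly 92.6 days inside server_restart.rp, and the warnings are 3600 seconds (one hour) apart, so the recovery program appears to have been running for about three months rather than only since the recent failure.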