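Environment: two p650 hosts running AIX 5200-3; S4300 disk array (8 x 73 GB, RAID 5); application: Domino 6.5.3 (hosts A and B each run 2 partitions); HACMP 5.1
Symptoms: 1. One day a Domino partition failed and the application went down. Because we had deployed a custom application monitor script, HACMP should have detected that this Domino partition was down and restarted the application. However, HACMP only detected the outage (the monitor script wrote log output) and never ran the stop/start scripts to restart the partition.
2. Afterwards we tested HACMP monitoring by simulating the Domino partition outage that the application monitor script checks for. For each partition tested (one on host A and one on host B), the monitor script did not detect the outage, wrote no log output, and Domino was not restarted.
Relevant logs: 1. /tmp/clstrmgr.debug
... (since the 13th, many entries like the following)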
Fri May 19 17:59:13 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
Fri May 19 17:59:43 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
Fri May 19 18:00:13 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
...
2./tmp/hacmp.out
... (present from very early on, many entries like the following)
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8004000 seconds. Please check cluster status.
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8007600 seconds. Please check cluster status.
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8011200 seconds. Please check cluster status.
...
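For scale, the counters in the hacmp.out warnings above can be converted to days. The following is a minimal Python sketch, not an HACMP tool: the function names `parse_stuck_warnings` and `seconds_to_days` are made up for illustration, and the regular expression assumes every warning has exactly the wording quoted above. On the three sample lines, 8004000 seconds is about 92.6 days, and consecutive warnings are 3600 seconds apart, which would suggest the cluster has been sitting in `server_restart.rp` for roughly three months. That may be related to the "State not STABLE/RP_RUNNING" entries in clstrmgr.debug, but this is only a guess.

```python
import re

# Matches the hacmp.out warning format quoted in this post (assumed to be fixed).
WARNING_RE = re.compile(
    r"WARNING: Cluster (?P<cluster>\S+) has been running recovery program "
    r"'(?P<program>[^']+)' for (?P<seconds>\d+) seconds\."
)

# The three warning lines copied from the post.
SAMPLE = [
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8004000 seconds. "
    "Please check cluster status.",
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8007600 seconds. "
    "Please check cluster status.",
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8011200 seconds. "
    "Please check cluster status.",
]


def parse_stuck_warnings(lines):
    """Return a (cluster, program, seconds) tuple for each matching warning line."""
    results = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            results.append(
                (match.group("cluster"), match.group("program"), int(match.group("seconds")))
            )
    return results


def seconds_to_days(seconds):
    """Convert a duration in seconds to days (86400 seconds per day)."""
    return seconds / 86400.0


if __name__ == "__main__":
    for cluster, program, seconds in parse_stuck_warnings(SAMPLE):
        print(f"{cluster}: {program} running for {seconds_to_days(seconds):.1f} days")
```

To check the real log, the lines of /tmp/hacmp.out could be passed to `parse_stuck_warnings` in place of `SAMPLE`.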
Could anyone please take a look? Thanks.