
Chinaunix


Help: in a heartbeat high-availability setup, all processes on one node were killed

Posted 2011-07-07 11:25
Below is the last part of the ha-log file. I would appreciate it if someone experienced could help analyze what caused this. (The heartbeat RPM package is heartbeat-2.1.3-0.9.)

heartbeat[3828]: 2011/06/27_11:32:29 WARN: nodename rpd136-server-1 uuid changed to rpd-server-1
heartbeat[3828]: 2011/06/27_11:32:29 WARN: nodename rpd-server-1 uuid changed to rpd136-server-1
heartbeat[3828]: 2011/06/27_11:32:30 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[3828]: 2011/06/27_11:32:30 ERROR: should_drop_message: attempted replay attack [rpd136-server-1]? [gen = 1291791681, curgen = 1307790052]
heartbeat[3828]: 2011/06/27_11:32:31 WARN: nodename rpd136-server-1 uuid changed to rpd-server-1
heartbeat[3828]: 2011/06/27_11:32:31 ERROR: HBDoMsg_T_ACK: corrupted ackseq current hiseq = 73956 ackseq =587418 in this message
heartbeat[3828]: 2011/06/27_11:32:31 WARN: nodename rpd-server-1 uuid changed to rpd136-server-1
heartbeat[3828]: 2011/06/27_11:32:32 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[3828]: 2011/06/27_11:32:32 ERROR: should_drop_message: attempted replay attack [rpd136-server-1]? [gen = 1291791681, curgen = 1307790052]
heartbeat[3828]: 2011/06/27_11:32:33 WARN: nodename rpd136-server-1 uuid changed to rpd-server-1
heartbeat[3828]: 2011/06/27_11:32:33 WARN: nodename rpd-server-1 uuid changed to rpd136-server-1
heartbeat[3828]: 2011/06/27_11:32:34 ERROR: Message hist queue is filling up (500 messages in queue)
heartbeat[3828]: 2011/06/27_11:32:34 ERROR: should_drop_message: attempted replay attack [rpd136-server-1]? [gen = 1291791681, curgen = 1307790052]
heartbeat[3828]: 2011/06/27_11:32:35 WARN: nodename rpd136-server-1 uuid changed to rpd-server-1
heartbeat[3828]: 2011/06/27_11:32:35 WARN: nodename rpd-server-1 uuid changed to rpd136-server-1
heartbeat[3828]: 2011/06/27_11:32:36 CRIT: Emergency Shutdown: Attempting to kill everything ourselves
heartbeat[3828]: 2011/06/27_11:32:36 info: killing /usr/lib64/heartbeat/ccm process group 3841 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing /usr/lib64/heartbeat/cib process group 3842 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing /usr/lib64/heartbeat/lrmd -r process group 3843 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing /usr/lib64/heartbeat/stonithd process group 3844 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing /usr/lib64/heartbeat/attrd process group 3845 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing /usr/lib64/heartbeat/crmd process group 3846 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing HBFIFO process 3831 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing HBWRITE process 3832 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing HBREAD process 3833 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing HBWRITE process 3834 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing HBREAD process 3835 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing HBWRITE process 3836 with signal 9
heartbeat[3828]: 2011/06/27_11:32:36 info: killing HBREAD process 3837 with signal 9
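
For anyone reading a longer ha-log, a rough way to see the pattern in the excerpt above is to count messages per severity and tally the "nodename X uuid changed to Y" warnings in each direction. This is only a sketch that assumes every line follows the format shown in this post; the function name `summarize` and the regular expressions are my own and are not part of heartbeat.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt in this post:
# heartbeat[3828]: 2011/06/27_11:32:29 WARN: nodename A uuid changed to B
LINE_RE = re.compile(
    r"^heartbeat\[(?P<pid>\d+)\]: (?P<ts>\S+) (?P<level>[A-Za-z]+): (?P<msg>.*)$"
)
UUID_FLIP_RE = re.compile(r"^nodename (\S+) uuid changed to (\S+)$")


def summarize(lines):
    """Count messages per severity and per (old name, new name) uuid change."""
    levels = Counter()
    flips = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a heartbeat log line
        levels[m.group("level")] += 1
        flip = UUID_FLIP_RE.match(m.group("msg"))
        if flip:
            flips[(flip.group(1), flip.group(2))] += 1
    return levels, flips


# A few lines copied from the log above.
sample = [
    "heartbeat[3828]: 2011/06/27_11:32:29 WARN: nodename rpd136-server-1 uuid changed to rpd-server-1",
    "heartbeat[3828]: 2011/06/27_11:32:29 WARN: nodename rpd-server-1 uuid changed to rpd136-server-1",
    "heartbeat[3828]: 2011/06/27_11:32:30 ERROR: Message hist queue is filling up (500 messages in queue)",
    "heartbeat[3828]: 2011/06/27_11:32:36 CRIT: Emergency Shutdown: Attempting to kill everything ourselves",
]
levels, flips = summarize(sample)
print(dict(levels))
print(dict(flips))
```

If the two directions show up in roughly equal numbers, as they do in the excerpt, the same UUID is being reported under two different node names, which is worth checking before anything else.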

Posted 2011-08-05 09:23
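Check whether the UUIDs on the two machines are consistent. If they are not, run the uuidgen command and write its output into the /etc/vx/.uuids/clusuuid file on both hosts, then restart the HA service.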
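
The comparison step suggested in this reply can be sketched as follows. It assumes the content of the UUID file has already been copied off each host as text and that the file holds a single UUID in the form uuidgen prints; both are assumptions, and the helper names `uuids_match` and `new_uuid_line` are hypothetical. Python's standard `uuid` module stands in for the uuidgen command here.

```python
import uuid


def normalize(text):
    """Parse UUID text (as printed by uuidgen) into a uuid.UUID object."""
    return uuid.UUID(text.strip())


def uuids_match(text_a, text_b):
    """True when both hosts hold the same UUID, ignoring case and whitespace."""
    return normalize(text_a) == normalize(text_b)


def new_uuid_line():
    """Random UUID followed by a newline, similar to uuidgen output."""
    return str(uuid.uuid4()) + "\n"


# Made-up file contents from two hosts; only case and trailing newline differ.
host_a = "6ba7b810-9dad-11d1-80b4-00c04fd430c8\n"
host_b = "6BA7B810-9DAD-11D1-80B4-00C04FD430C8"
print(uuids_match(host_a, host_b))
```

Parsing into `uuid.UUID` before comparing avoids false mismatches from letter case or a trailing newline. If the values differ, the same freshly generated line would be written to both hosts, as the reply describes.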

Posted 2013-09-13 15:53
Wow, I commented out the UUID field for every network interface, and this error never showed up again.
I am on heartbeat 2.0.8.

In reply to #2 a774050174
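
The post does not say which files were edited. Assuming it means Red Hat style interface files (for example ifcfg-eth0) that contain a `UUID=` line, the change can be sketched like this. The function name `comment_out_uuid` and the sample content are made up for illustration.

```python
def comment_out_uuid(config_text):
    """Prefix every uncommented UUID= line with '#', leaving other lines alone."""
    out = []
    for line in config_text.splitlines(keepends=True):
        if line.lstrip().startswith("UUID="):
            out.append("#" + line)
        else:
            out.append(line)
    return "".join(out)


# Hypothetical interface file content; the UUID value is a placeholder.
sample = (
    "DEVICE=eth0\n"
    "UUID=00000000-0000-0000-0000-000000000000\n"
    "ONBOOT=yes\n"
)
print(comment_out_uuid(sample), end="")
```

Lines that are already commented out are left untouched, so running it twice gives the same result. Back up the original files before changing them on a real host.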


   