After a long search, I found an explanation on this site (http://sources.redhat.com/cluster/faq.html) of what happens in a two-node cluster when the nodes lose contact with each other:
“When each node recognizes that the other has stopped responding, it will try to fence the other. It can be like a gunfight at the O.K. Coral, and the node that's quickest on the draw (first to fence the other) wins. Unfortunately, both nodes can end up going down simultaneously, losing the whole cluster.
It's possible to avoid this by using a network power switch that serializes the two fencing operations. That ensures that one node is rebooted and the second never fences the first. ”
“Strangely, if you have a persistent network problem and the fencing device is still accessible to both nodes, this can result in a "A reboots B, B reboots A" fencing loop.
This problem can be gotten around by using a quorum disk or partition to break the tie.”
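To convince myself why a quorum disk breaks the tie, here is a rough sketch of the vote arithmetic. It assumes each node has 1 vote, the quorum disk contributes 1 vote, and quorum is a simple majority of the expected votes. These values and the function names are my own illustration, not taken from the FAQ or from the actual cman code.

```python
# Illustrative vote arithmetic for a two-node cluster with a quorum disk.
# All vote values below are assumptions chosen for the example.

def quorum_threshold(expected_votes: int) -> int:
    """Simple majority: strictly more than half of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_held: int, expected_votes: int) -> bool:
    """A partition may keep running only if it holds a majority."""
    return votes_held >= quorum_threshold(expected_votes)


NODE_VOTES = 1    # assumed vote per node
QDISK_VOTES = 1   # assumed vote contributed by the quorum disk
EXPECTED_VOTES = 2 * NODE_VOTES + QDISK_VOTES

# Network split: each node sees only itself, and only one of them
# can hold the quorum disk's vote at a time.
votes_node_a = NODE_VOTES + QDISK_VOTES  # node that holds the quorum disk
votes_node_b = NODE_VOTES                # node that lost it

print("threshold:", quorum_threshold(EXPECTED_VOTES))
print("node A quorate:", is_quorate(votes_node_a, EXPECTED_VOTES))
print("node B quorate:", is_quorate(votes_node_b, EXPECTED_VOTES))
```

Under these assumptions only the node holding the quorum disk stays quorate, so only that node is entitled to fence the other, and the "A reboots B, B reboots A" loop cannot start.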
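Thanks, moderator nntp!
In my case there is no power switch, and the version is still RHCS U2.
I'm going to try this first: carry the heartbeat over a different network interface (eth0, the IPMI LAN), as described at http://sources.redhat.com/cluster/faq.html#cman_heartbeat_nic. What I don't know is whether the IPMI LAN will work as the fence device.
If that still doesn't work, I'll upgrade RHCS to U4, configure a quorum disk, and try again.
Damn, RHCS U2 really is awful!
It's also my own fault for not researching this properly. Going with U4 and a quorum disk from the start would probably have saved a lot of trouble.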