Chinaunix

heartbeat service restarts automatically

#1, posted 2014-06-19 09:59
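After finishing the Linux-HA configuration I wanted to test HA failover.
The environment is Red Hat 6.4 with heartbeat 2.1.4 installed.
The HA setup is two virtual machines under vSphere. Since I could not add a serial port, the heartbeat uses IP unicast.
Network configuration: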
vsyslog1 eth0 172.16.5.242
vsyslog2  eth0 172.16.5.243

ha.d/haresources
vsyslog1 172.16.5.152/24/eth0
vsyslog2 172.16.5.153/24/eth0

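After the heartbeat service started on both machines, both VIPs were on node 1. I switched VIP 153 to node 2 and then disconnected node 2's virtual network; the VIP automatically moved back to node 1.
The problem: after node 2's network was restored, for some reason the heartbeat service restarted on both machines, and after the restart the second VIP (153) moved back to node 2.
But I clearly configured auto_failback off!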
######################################
[root@vsyslog1 ha.d]# cat ha.cf
debugfile /var/log/halog/ha-debug

logfile /var/log/halog/ha-log

logfacility     local0

keepalive 2

deadtime 20

warntime 5

initdead 120

udpport 694

ucast eth0 172.16.5.243

auto_failback off  

node vsyslog1

node vsyslog2

ping 172.16.5.30

hopfudge 1

deadping 5


######################################

Log from that time:
Switching VIP 153 to node 2
heartbeat[7797]: 2014/06/19_09:17:58 info: vsyslog1 wants to go standby [foreign]
heartbeat[7797]: 2014/06/19_09:17:59 info: standby: vsyslog2 can take our foreign resources
heartbeat[8548]: 2014/06/19_09:17:59 info: give up foreign HA resources (standby).
ResourceManager[8561]:  2014/06/19_09:17:59 info: Releasing resource group: vsyslog2 172.16.5.153/24/eth0
ResourceManager[8561]:  2014/06/19_09:17:59 info: Running /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 stop
ResourceManager[8561]:  2014/06/19_09:17:59 debug: Starting /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 stop
In IP Stop
SIOCDELRT: No such process
IPaddr[8630]:   2014/06/19_09:17:59 INFO: ifconfig eth0:0 down
IPaddr[8601]:   2014/06/19_09:17:59 INFO:  Success
INFO:  Success
ResourceManager[8561]:  2014/06/19_09:17:59 debug: /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 stop done. RC=0
heartbeat[8548]: 2014/06/19_09:17:59 info: foreign HA resource release completed (standby).
heartbeat[7797]: 2014/06/19_09:17:59 info: Local standby process completed [foreign].
heartbeat[7797]: 2014/06/19_09:18:00 WARN: 1 lost packet(s) for [vsyslog2] [46:48]
heartbeat[7797]: 2014/06/19_09:18:00 info: remote resource transition completed.
heartbeat[7797]: 2014/06/19_09:18:00 info: No pkts missing from vsyslog2!
heartbeat[7797]: 2014/06/19_09:18:00 info: Other node completed standby takeover of foreign resources.
heartbeat[7797]: 2014/06/19_09:19:05 info: vsyslog2 wants to go standby [local]
heartbeat[7797]: 2014/06/19_09:19:06 info: standby: acquire [local] resources from vsyslog2
heartbeat[8663]: 2014/06/19_09:19:06 info: acquire foreign HA resources (standby).
ResourceManager[8676]:  2014/06/19_09:19:06 info: Acquiring resource group: vsyslog2 172.16.5.153/24/eth0
IPaddr[8703]:   2014/06/19_09:19:06 INFO:  Resource is stopped
ResourceManager[8676]:  2014/06/19_09:19:06 info: Running /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 start
ResourceManager[8676]:  2014/06/19_09:19:06 debug: Starting /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 start
IPaddr[8803]:   2014/06/19_09:19:06 INFO: Using calculated netmask for 172.16.5.153: 255.255.255.0
IPaddr[8803]:   2014/06/19_09:19:06 DEBUG: Using calculated broadcast for 172.16.5.153: 172.16.5.255
IPaddr[8803]:   2014/06/19_09:19:06 INFO: eval ifconfig eth0:0 172.16.5.153 netmask 255.255.255.0 broadcast 172.16.5.255
IPaddr[8803]:   2014/06/19_09:19:06 DEBUG: Sending Gratuitous Arp for 172.16.5.153 on eth0:0 [eth0]
IPaddr[8774]:   2014/06/19_09:19:06 INFO:  Success
INFO:  Success
ResourceManager[8676]:  2014/06/19_09:19:06 debug: /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 start done. RC=0
heartbeat[8663]: 2014/06/19_09:19:06 info: foreign HA resource acquisition completed (standby).
heartbeat[7797]: 2014/06/19_09:19:06 info: Standby resource acquisition done [local].
heartbeat[7797]: 2014/06/19_09:19:07 info: remote resource transition completed.
Disconnecting node 2's network
heartbeat[7797]: 2014/06/19_09:22:00 info: Link vsyslog2:eth0 dead.
heartbeat[7797]: 2014/06/19_09:22:14 WARN: node vsyslog2: is dead
heartbeat[7797]: 2014/06/19_09:22:14 info: Dead node vsyslog2 gave up resources.
heartbeat[7797]: 2014/06/19_09:22:23 CRIT: Cluster node vsyslog2 returning after partition.
heartbeat[7797]: 2014/06/19_09:22:23 info: For information on cluster partitions, See URL: http://linux-ha.org/SplitBrain
heartbeat[7797]: 2014/06/19_09:22:23 WARN: Deadtime value may be too small.
heartbeat[7797]: 2014/06/19_09:22:23 info: See FAQ for information on tuning deadtime.
heartbeat[7797]: 2014/06/19_09:22:23 info: URL: http://linux-ha.org/FAQ#heavy_load
After node 2's network was restored, the service restarted
heartbeat[7797]: 2014/06/19_09:22:23 info: Link vsyslog2:eth0 up.
heartbeat[7797]: 2014/06/19_09:22:23 WARN: Late heartbeat: Node vsyslog2: interval 29010 ms
heartbeat[7797]: 2014/06/19_09:22:23 info: Status update for node vsyslog2: status active
heartbeat[8920]: 2014/06/19_09:22:23 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[8920]:     2014/06/19_09:22:23 info: Running /usr/local/etc/ha.d/rc.d/status status
heartbeat[7797]: 2014/06/19_09:22:25 info: Received shutdown notice from 'vsyslog2'.
heartbeat[7797]: 2014/06/19_09:22:25 info: Resources being acquired from vsyslog2.
heartbeat[7797]: 2014/06/19_09:22:25 WARN: Shutdown delayed until current resource activity finishes.
heartbeat[8936]: 2014/06/19_09:22:25 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[8936]:     2014/06/19_09:22:25 info: Running /usr/local/etc/ha.d/rc.d/status status
mach_down[8955]:        2014/06/19_09:22:25 info: Taking over resource group 172.16.5.153/24/eth0
ResourceManager[9004]:  2014/06/19_09:22:25 info: Acquiring resource group: vsyslog2 172.16.5.153/24/eth0
IPaddr[9026]:   2014/06/19_09:22:26 INFO:  Running OK
heartbeat[8937]: 2014/06/19_09:22:26 info: Local Resource acquisition completed.
heartbeat[7797]: 2014/06/19_09:22:26 debug: StartNextRemoteRscReq(): child count 1
IPaddr[9066]:   2014/06/19_09:22:26 INFO:  Running OK
mach_down[8955]:        2014/06/19_09:22:26 info: /usr/local/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[8955]:        2014/06/19_09:22:26 info: mach_down takeover complete for node vsyslog2.
heartbeat[7797]: 2014/06/19_09:22:26 info: mach_down takeover complete.
heartbeat[7797]: 2014/06/19_09:22:26 info: Heartbeat shutdown in progress. (7797)
heartbeat[9162]: 2014/06/19_09:22:26 info: Giving up all HA resources.
ResourceManager[9175]:  2014/06/19_09:22:26 info: Releasing resource group: vsyslog1 172.16.5.152/24/eth0
ResourceManager[9175]:  2014/06/19_09:22:26 info: Running /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.152/24/eth0 stop
ResourceManager[9175]:  2014/06/19_09:22:26 debug: Starting /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.152/24/eth0 stop
In IP Stop
SIOCDELRT: No such process
IPaddr[9244]:   2014/06/19_09:22:26 INFO: ifconfig eth0:1 down
IPaddr[9215]:   2014/06/19_09:22:26 INFO:  Success
INFO:  Success
ResourceManager[9175]:  2014/06/19_09:22:26 debug: /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.152/24/eth0 stop done. RC=0
ResourceManager[9274]:  2014/06/19_09:22:26 info: Releasing resource group: vsyslog2 172.16.5.153/24/eth0
ResourceManager[9274]:  2014/06/19_09:22:26 info: Running /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 stop
ResourceManager[9274]:  2014/06/19_09:22:26 debug: Starting /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 stop
In IP Stop
SIOCDELRT: No such process
IPaddr[9343]:   2014/06/19_09:22:26 INFO: ifconfig eth0:0 down
IPaddr[9314]:   2014/06/19_09:22:26 INFO:  Success
INFO:  Success
ResourceManager[9274]:  2014/06/19_09:22:26 debug: /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 stop done. RC=0
heartbeat[9162]: 2014/06/19_09:22:26 info: All HA resources relinquished.
heartbeat[7797]: 2014/06/19_09:22:28 info: killing HBFIFO process 7801 with signal 15
heartbeat[7797]: 2014/06/19_09:22:28 info: killing HBWRITE process 7802 with signal 15
heartbeat[7797]: 2014/06/19_09:22:28 info: killing HBREAD process 7803 with signal 15
heartbeat[7797]: 2014/06/19_09:22:28 info: killing HBREAD process 7805 with signal 15
heartbeat[7797]: 2014/06/19_09:22:28 info: killing HBWRITE process 7804 with signal 15
heartbeat[7797]: 2014/06/19_09:22:28 info: Core process 7802 exited. 5 remaining
heartbeat[7797]: 2014/06/19_09:22:28 info: Core process 7805 exited. 4 remaining
heartbeat[7797]: 2014/06/19_09:22:28 info: Core process 7803 exited. 3 remaining
heartbeat[7797]: 2014/06/19_09:22:28 info: Core process 7804 exited. 2 remaining
heartbeat[7797]: 2014/06/19_09:22:28 info: Core process 7801 exited. 1 remaining
heartbeat[7797]: 2014/06/19_09:22:28 info: vsyslog1 Heartbeat shutdown complete.
heartbeat[7797]: 2014/06/19_09:22:28 info: Heartbeat restart triggered.
heartbeat[7797]: 2014/06/19_09:22:28 info: Restarting heartbeat.
heartbeat[7797]: 2014/06/19_09:22:28 info: Performing heartbeat restart exec.
heartbeat[7797]: 2014/06/19_09:22:49 info: Version 2 support: false
heartbeat[7797]: 2014/06/19_09:22:49 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[7797]: 2014/06/19_09:22:49 info: **************************
heartbeat[7797]: 2014/06/19_09:22:49 info: Configuration validated. Starting heartbeat 2.1.4
heartbeat[9374]: 2014/06/19_09:22:49 info: heartbeat: version 2.1.4
heartbeat[9374]: 2014/06/19_09:22:49 info: Heartbeat generation: 1402891661
heartbeat[9374]: 2014/06/19_09:22:49 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
heartbeat[9374]: 2014/06/19_09:22:49 info: glib: ucast: bound send socket to device: eth0
heartbeat[9374]: 2014/06/19_09:22:49 info: glib: ucast: bound receive socket to device: eth0
heartbeat[9374]: 2014/06/19_09:22:49 info: glib: ucast: started on port 694 interface eth0 to 172.16.5.243
heartbeat[9374]: 2014/06/19_09:22:49 info: glib: ping heartbeat started.
heartbeat[9374]: 2014/06/19_09:22:49 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[9374]: 2014/06/19_09:22:49 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[9374]: 2014/06/19_09:22:49 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[9374]: 2014/06/19_09:22:49 info: Local status now set to: 'up'
heartbeat[9374]: 2014/06/19_09:22:49 info: Link 172.16.5.30:172.16.5.30 up.
heartbeat[9374]: 2014/06/19_09:22:49 info: Status update for node 172.16.5.30: status ping
heartbeat[9374]: 2014/06/19_09:22:49 info: Link vsyslog2:eth0 up.
heartbeat[9374]: 2014/06/19_09:22:49 debug: get_delnodelist: delnodelist=
heartbeat[9374]: 2014/06/19_09:22:50 info: Status update for node vsyslog2: status active
heartbeat[9374]: 2014/06/19_09:22:50 info: Comm_now_up(): updating status to active
heartbeat[9374]: 2014/06/19_09:22:50 info: Local status now set to: 'active'
heartbeat[9384]: 2014/06/19_09:22:50 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[9384]:     2014/06/19_09:22:50 info: Running /usr/local/etc/ha.d/rc.d/status status
heartbeat[9374]: 2014/06/19_09:23:00 info: local resource transition completed.
heartbeat[9374]: 2014/06/19_09:23:00 info: Initial resource acquisition complete (T_RESOURCES(us))
IPaddr[9438]:   2014/06/19_09:23:00 INFO:  Resource is stopped
heartbeat[9402]: 2014/06/19_09:23:00 info: Local Resource acquisition completed.
heartbeat[9374]: 2014/06/19_09:23:00 debug: StartNextRemoteRscReq(): child count 1
heartbeat[9489]: 2014/06/19_09:23:00 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[9489]:     2014/06/19_09:23:00 info: Running /usr/local/etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[9489]:  2014/06/19_09:23:00 received ip-request-resp 172.16.5.152/24/eth0 OK yes
ResourceManager[9510]:  2014/06/19_09:23:00 info: Acquiring resource group: vsyslog1 172.16.5.152/24/eth0
IPaddr[9537]:   2014/06/19_09:23:00 INFO:  Resource is stopped
ResourceManager[9510]:  2014/06/19_09:23:00 info: Running /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.152/24/eth0 start
ResourceManager[9510]:  2014/06/19_09:23:00 debug: Starting /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.152/24/eth0 start
IPaddr[9637]:   2014/06/19_09:23:00 INFO: Using calculated netmask for 172.16.5.152: 255.255.255.0
IPaddr[9637]:   2014/06/19_09:23:00 DEBUG: Using calculated broadcast for 172.16.5.152: 172.16.5.255
IPaddr[9637]:   2014/06/19_09:23:00 INFO: eval ifconfig eth0:0 172.16.5.152 netmask 255.255.255.0 broadcast 172.16.5.255
IPaddr[9637]:   2014/06/19_09:23:00 DEBUG: Sending Gratuitous Arp for 172.16.5.152 on eth0:0 [eth0]
IPaddr[9608]:   2014/06/19_09:23:00 INFO:  Success
INFO:  Success
ResourceManager[9510]:  2014/06/19_09:23:00 debug: /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.152/24/eth0 start done. RC=0
heartbeat[9374]: 2014/06/19_09:23:00 info: remote resource transition completed.
heartbeat[9374]: 2014/06/19_09:26:24 info: Received shutdown notice from 'vsyslog2'.
heartbeat[9374]: 2014/06/19_09:26:24 info: Resources being acquired from vsyslog2.
heartbeat[9374]: 2014/06/19_09:26:24 debug: StartNextRemoteRscReq(): child count 1
heartbeat[9740]: 2014/06/19_09:26:24 info: acquire local HA resources (standby).
ResourceManager[9766]:  2014/06/19_09:26:24 info: Acquiring resource group: vsyslog1 172.16.5.152/24/eth0
IPaddr[9817]:   2014/06/19_09:26:24 INFO:  Running OK
IPaddr[9816]:   2014/06/19_09:26:24 INFO:  Running OK
heartbeat[9740]: 2014/06/19_09:26:24 info: local HA resource acquisition completed (standby).
heartbeat[9741]: 2014/06/19_09:26:24 info: Local Resource acquisition completed.
heartbeat[9374]: 2014/06/19_09:26:24 info: Standby resource acquisition done [all].
heartbeat[9374]: 2014/06/19_09:26:24 debug: StartNextRemoteRscReq(): child count 1
heartbeat[9924]: 2014/06/19_09:26:24 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[9924]:     2014/06/19_09:26:24 info: Running /usr/local/etc/ha.d/rc.d/status status
mach_down[9940]:        2014/06/19_09:26:24 info: Taking over resource group 172.16.5.153/24/eth0
ResourceManager[9966]:  2014/06/19_09:26:24 info: Acquiring resource group: vsyslog2 172.16.5.153/24/eth0
IPaddr[9993]:   2014/06/19_09:26:24 INFO:  Resource is stopped
ResourceManager[9966]:  2014/06/19_09:26:24 info: Running /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 start
ResourceManager[9966]:  2014/06/19_09:26:24 debug: Starting /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 start
IPaddr[10093]:  2014/06/19_09:26:25 INFO: Using calculated netmask for 172.16.5.153: 255.255.255.0
IPaddr[10093]:  2014/06/19_09:26:25 DEBUG: Using calculated broadcast for 172.16.5.153: 172.16.5.255
IPaddr[10093]:  2014/06/19_09:26:25 INFO: eval ifconfig eth0:1 172.16.5.153 netmask 255.255.255.0 broadcast 172.16.5.255
IPaddr[10093]:  2014/06/19_09:26:25 DEBUG: Sending Gratuitous Arp for 172.16.5.153 on eth0:1 [eth0]
IPaddr[10064]:  2014/06/19_09:26:25 INFO:  Success
INFO:  Success
ResourceManager[9966]:  2014/06/19_09:26:25 debug: /usr/local/etc/ha.d/resource.d/IPaddr 172.16.5.153/24/eth0 start done. RC=0
mach_down[9940]:        2014/06/19_09:26:25 info: /usr/local/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down[9940]:        2014/06/19_09:26:25 info: mach_down takeover complete for node vsyslog2.
heartbeat[9374]: 2014/06/19_09:26:25 info: mach_down takeover complete.
heartbeat[9374]: 2014/06/19_09:26:31 info: Link vsyslog2:eth0 dead.
heartbeat[9374]: 2014/06/19_09:26:45 WARN: node vsyslog2: is dead
heartbeat[9374]: 2014/06/19_09:26:45 info: Dead node vsyslog2 gave up resources.
heartbeat[9374]: 2014/06/19_09:26:49 info: Heartbeat restart on node vsyslog2
heartbeat[9374]: 2014/06/19_09:26:49 info: Link vsyslog2:eth0 up.
heartbeat[9374]: 2014/06/19_09:26:49 info: Status update for node vsyslog2: status init
heartbeat[9374]: 2014/06/19_09:26:49 info: Status update for node vsyslog2: status up
heartbeat[9374]: 2014/06/19_09:26:49 debug: StartNextRemoteRscReq(): child count 1
heartbeat[10202]: 2014/06/19_09:26:49 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[10202]:    2014/06/19_09:26:49 info: Running /usr/local/etc/ha.d/rc.d/status status
heartbeat[10218]: 2014/06/19_09:26:49 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[10218]:    2014/06/19_09:26:49 info: Running /usr/local/etc/ha.d/rc.d/status status
heartbeat[9374]: 2014/06/19_09:26:49 debug: get_delnodelist: delnodelist=
heartbeat[9374]: 2014/06/19_09:26:50 info: Status update for node vsyslog2: status active
heartbeat[10234]: 2014/06/19_09:26:50 debug: notify_world: setting SIGCHLD Handler to SIG_DFL
harc[10234]:    2014/06/19_09:26:50 info: Running /usr/local/etc/ha.d/rc.d/status status
heartbeat[9374]: 2014/06/19_09:26:50 info: remote resource transition completed.
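
As a rough cross-check of the timing in the log above, the following sketch compares the reported late-heartbeat interval with the `deadtime` from ha.cf and measures how long vsyslog2 had been declared dead before it reappeared. The log lines and the `keepalive`/`deadtime` values are copied from this post; the function names and the regular expressions are illustrative assumptions, not part of heartbeat.

```python
import re
from datetime import datetime

# Values copied from the ha.cf in this post (seconds).
KEEPALIVE = 2
DEADTIME = 20

# Four lines copied verbatim from the log on vsyslog1.
LOG = """\
heartbeat[7797]: 2014/06/19_09:22:00 info: Link vsyslog2:eth0 dead.
heartbeat[7797]: 2014/06/19_09:22:14 WARN: node vsyslog2: is dead
heartbeat[7797]: 2014/06/19_09:22:23 CRIT: Cluster node vsyslog2 returning after partition.
heartbeat[7797]: 2014/06/19_09:22:23 WARN: Late heartbeat: Node vsyslog2: interval 29010 ms
"""

STAMP = re.compile(r"(\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2})")
LATE = re.compile(r"Late heartbeat: Node (\S+): interval (\d+) ms")


def parse_time(line):
    """Extract the heartbeat-style timestamp from one log line, or None."""
    m = STAMP.search(line)
    return datetime.strptime(m.group(1), "%Y/%m/%d_%H:%M:%S") if m else None


def late_heartbeats(log):
    """Return (node, interval_in_seconds) for every 'Late heartbeat' line."""
    return [(m.group(1), int(m.group(2)) / 1000.0) for m in LATE.finditer(log)]


def seconds_between(log, first_marker, second_marker):
    """Seconds from the first line containing first_marker to the first
    line containing second_marker."""
    times = {}
    for line in log.splitlines():
        for marker in (first_marker, second_marker):
            if marker in line and marker not in times:
                times[marker] = parse_time(line)
    return (times[second_marker] - times[first_marker]).total_seconds()


for node, interval in late_heartbeats(LOG):
    verdict = "exceeds" if interval > DEADTIME else "is within"
    print(f"{node}: late heartbeat of {interval:.2f} s {verdict} deadtime ({DEADTIME} s)")

gap = seconds_between(LOG, "is dead", "returning after partition")
print(f"vsyslog2 reappeared {gap:.0f} s after being declared dead")
```

On these lines the late-heartbeat interval (29.01 s) is longer than `deadtime` (20 s), and vsyslog2 reappeared 9 s after being declared dead, which is consistent with the "returning after partition" message that precedes the restart.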

#2, posted 2014-06-19 11:27
A 1.x-era configuration plus a 2.x-era product, used in the 3.x era.

OP, that is a bit outdated.

#3, posted 2014-06-19 11:44
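Reply to #2 q1208c

3.x is too complicated and I have not figured it out yet. I only need a two-node HA, and the 2.x configuration is simpler.
What should the 2.x-style configuration you mention look like?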


   

#4, posted 2014-06-19 12:39
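Reply to #3 zhjixi1234

The 2.x configuration is the same as 3.x. What you are using now is a 1.x-era configuration.

2.1.4 should be the last 2.x release.

If you use crm, I believe the resources will not fail back, or at least you can control whether they do.
It also gives you many other fine-grained controls.

If you do not want to write the configuration by hand, I recall that 2.x ships with a graphical configuration manager. I have used it; unless the configuration is very unusual, it can handle it.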

   

#5, posted 2014-06-27 10:42
Last edited by kivis on 2014-06-27 10:49

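Judging from my own recent experiments, what you hit is split-brain; I ran into the same problem. I eventually went back and studied how HA works, and found the cause.

From your ha.cf, your heartbeat is configured as: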
ucast eth0 172.16.5.243
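With this setup, when you disconnect the public network, the primary and standby can no longer receive each other's heartbeats and neither side knows the other's state. In that situation the standby starts the resources.
When you then reconnect the primary's network cable and the two nodes can talk again, split-brain is detected.
By HA design, after a split-brain the HA service on both the primary and the standby restarts, and then the primary takes over the resources.

You can add a dedicated heartbeat link on a separate NIC, eth1, and use the following configuration: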
#ucast eth0 172.16.5.243
bcast eth1
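The heartbeat link should generally never be disconnected.

Then enable the ipfail tool and use it together with the ping directive already in your configuration to detect public-network failures, which become the trigger for failover. The heartbeat link carries the communication between the HA servers and is normally never disconnected.

That way, when the primary's public network fails, it releases its resources automatically and notifies the standby to start them. After the network recovers, the auto_failback setting decides whether the resources move back to the primary.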
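
The two points above (a heartbeat path separate from the public interface, and ipfail acting on the ping node) can be checked mechanically. The sketch below is a minimal, hypothetical reviewer for ha.cf and haresources text, not a heartbeat tool. The function names are made up for illustration, the parsing is deliberately rough, and the ipfail path in the suggested config is an assumption that depends on where heartbeat was installed.

```python
# Directives that carry heartbeats to the peer node.
MEDIA = ("ucast", "bcast", "mcast", "serial")

OP_HA_CF = """\
ucast eth0 172.16.5.243
auto_failback off
node vsyslog1
node vsyslog2
ping 172.16.5.30
"""

OP_HARESOURCES = """\
vsyslog1 172.16.5.152/24/eth0
vsyslog2 172.16.5.153/24/eth0
"""

# The change proposed in this reply; the ipfail path is an assumed location.
SUGGESTED_HA_CF = """\
#ucast eth0 172.16.5.243
bcast eth1
auto_failback off
node vsyslog1
node vsyslog2
ping 172.16.5.30
respawn hacluster /usr/lib/heartbeat/ipfail
"""


def directive_words(ha_cf):
    """Yield the whitespace-split words of each non-comment ha.cf line."""
    for raw in ha_cf.splitlines():
        words = raw.split("#", 1)[0].split()
        if words:
            yield words


def heartbeat_devices(ha_cf):
    """Devices named on ucast/bcast/mcast/serial lines."""
    devs = set()
    for words in directive_words(ha_cf):
        if words[0] in MEDIA and len(words) >= 2:
            # bcast may list several interfaces; the others name one device.
            devs.update(words[1:] if words[0] == "bcast" else words[1:2])
    return devs


def vip_interfaces(haresources):
    """Interfaces from ip/mask/iface resources (rough parse)."""
    devs = set()
    for line in haresources.splitlines():
        for token in line.split()[1:]:
            parts = token.split("/")
            if len(parts) >= 3 and parts[0].count(".") == 3:
                devs.add(parts[2])
    return devs


def review(ha_cf, haresources):
    """Return a list of warnings about the two points raised in this reply."""
    warnings = []
    hb = heartbeat_devices(ha_cf)
    if hb and hb <= vip_interfaces(haresources):
        warnings.append("heartbeat only uses the service interface(s): "
                        + ", ".join(sorted(hb)))
    lines = list(directive_words(ha_cf))
    has_ping = any(w[0] == "ping" for w in lines)
    has_ipfail = any(w[0] == "respawn" and "ipfail" in " ".join(w[1:])
                     for w in lines)
    if has_ping and not has_ipfail:
        warnings.append("ping node configured but ipfail is not respawned")
    return warnings


for name, cfg in (("original", OP_HA_CF), ("suggested", SUGGESTED_HA_CF)):
    print(name, review(cfg, OP_HARESOURCES))
```

For the original ha.cf this reports both issues; for the suggested variant it reports none. It does not validate anything else in the files.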

See the blog post below for how HA is expected to respond in failover tests:
http://ixdba.blog.51cto.com/2895551/747510

